I. AN INTRODUCTION

Although the application of software database management
systems to user requirements is not new, there are emerging
special-purpose hardware systems which will relieve the host
central processing unit (CPU) from the time-consuming
processes of accessing, updating, and modifying data.
Numerous commercially available software database
management systems for the host computers are currently
employed in application areas, but there appears to be
associated performance degradation in the host machines.
These performance issues must be identified and performance
measured in order to provide some quantitative comparisons
between software systems, general-purpose hardware systems,
and special-purpose hardware systems. Historically this
information has been collected for general-purpose computers
by the use of the instruction mix (Gibson or Flynn) to
measure performance in various categories. This measurement
of a machine using an instruction mix as a tool is called
benchmarking.

The task of benchmarking a database system has not been
developed in the literature. Consequently, a research
project has been undertaken by the Naval Postgraduate School
to develop a set of benchmarking standards which can be
employed to obtain a performance index of a particular
database machine/system and further used in a comparative
analysis with respect to other database systems/machines.

The initial steps in the benchmark development have been
limited to a specific relational database machine. In
addition to the measurements of specific database
operations, a question of the role and responsibilities of
the database administrator (DBA) is posed. With each system
benchmarked, there is a need to establish the amount of
support provided to DBA. In this case an examination of the
facilities provided, query language employed, and amount of
additional DBA support required is conducted.

The objective of this thesis is to categorize the duties
and responsibilities of DBA and describe how they are
supported by the benchmarked system. At the beginning, the
system environment is described, followed by a discussion of
the query language. An analysis of DBA functions is then
made and finally, the fully relational model is examined and
a comparison of this particular query language with another
well-known language is made.

This thesis is one in a series of four describing the
current status of the benchmark development. The other
three topics are on generating the synthetic database
[Ref. 1], selection and projection [Ref. 2], and join
operations [Ref. 3].


II. THE BENCHMARKING ENVIRONMENT

A. THE HOST SYSTEM

The host machine for the benchmark is a Univac 1100/42
located at Pacific Missile Test Center, Point Mugu,
California. In addition to on-site equipment, a remote
terminal is installed at the Naval Postgraduate School.

1. The Hardware Interface

The hardware interface between the host and the
database machine is through a Univac 1100/42 I/O channel.
This interface channel has a 200-thousand byte/second
capacity and the transmission unit is either a byte or a
word.

2. The Software Interface

The host software is written by Amperif Corporation
of Chatsworth, California. This software consists of the
host-driver routines whose primary purpose is to parse the
queries and to translate them into the database machine
language. Finally, the host handles the communications
protocol between the database machine and itself.


B. THE BACKEND DATABASE MACHINE

1. The RDM 1100 Design

The database machine which interfaces with the host
is an IDM 500 manufactured by Britton-Lee Incorporated of
Los Gatos, California. IDM 500 is being marketed by Amperif
Corporation as an RDM 1100. It is a modular, expandable,
and microprocessor-based system organized around a central
high-speed bus. The separate modules are functionally
oriented. The RDM 1100 employs the relational database
model which will be discussed in detail in Section V.

The database processor (Z8000-based microprocessor)
supervises and manages all system resources. This processor
executes most of the software in the system.

The database accelerator is an optional, high-speed
processor with an instruction set specifically designed to
perform certain relational database functions. The
accelerator has a three-stage pipeline which executes
instructions at up to 10 MIPS. This processor can initiate
disk activity and can process data at disk transfer rates.
The accelerator and the RDM 1100 software are so configured
that the most frequently occurring database work is
performed by the accelerator under the direction of the
database processor.

The cache memory (i.e., main memory) of RDM 1100 is
composed of 64K-bit chips of dynamic RAM. It may be
expanded to a maximum of six megabytes. This cache is
utilized for RDM 1100 system code, disk caching, indices,
and user commands.

Disk controller modules may be expanded from one to
four. Each controller can manage from one to four disk
drives. The disk controller moves data between the disks
and the cache memory, and is designed to work with the
accelerator. An optional tape control module supports up to
eight tape drives which can be used for direct disk-to-tape
backup, data, and software loading.

RDM 1100 and the host(s) communicate with each other
via RDM 1100's host-interface module. This module accepts
commands from one or more hosts, and acts on those commands
accordingly. Each host-interface module can handle up to
eight hosts and a maximum of eight host-interface modules
can be made available on RDM 1100. Hence, a maximum of 64
hosts can be accommodated by RDM 1100. In addition to
communications handshaking protocols, the interface module
performs necessary error checks and causes the host to
retransmit any information block in which an error is
detected. [Ref. 4]

2. The System Configuration

In the configuration described above (i.e., the
connection of the host and the database machine with an I/O
channel), the database machine is called a backend database
machine. The term 'backend' is used in this context to
refer to a special-purpose machine operating as a peripheral
device on one or more host systems. As previously
mentioned, the use of the backend machine can significantly
reduce the required CPU time for data manipulation by the
host. Further advantages are realized through freeing disk
space on the host and the reduction of I/O cycles, thus
releasing the CPU to perform other functions necessary for
the proper operation of the system and execution of
applications programs.

The performance of RDM 1100 is highly dependent on
the available hardware configuration. Other performance
issues such as indexing and data positioning are dependent
on the software developed. The hardware configurations are
discussed below and the software issues are discussed in
Section IV.

Four test configurations are used during the course
of this research. The initial configuration is of one-half
megabyte of cache without the accelerator. This
configuration will not be marketed by Amperif, but is tested
for the purpose of comparison. The next test configuration
is of one megabyte of cache with the accelerator. Following
it, a two-megabyte cache with the accelerator is tested.
Finally, the accelerator is removed from the configuration
and the system is tested with only the two-megabyte
cache. The standard commercial configuration is with one
megabyte of cache. The accelerator is an optional feature.
For specific information on the performance measurements,
the reader is directed to [Ref. 2] and [Ref. 3].



III. THE RELATIONAL QUERY LANGUAGE

A. AN INTRODUCTION TO THE LANGUAGE

In addition to the hardware and software to support the
host/backend interface, Amperif also provides a language for
requesting information or operations on data from the
backend database machine. This language is called the
Relational Query Language (RQL). The language, being the
only interface for the user and database administrator
(DBA), is the sole means by which the capabilities and
limitations of the backend are known to the user and DBA.
Therefore, a discussion about the facilities of RQL will be
presented.

This section defines two major command groups available
in RQL. The metanotation used in the command syntax
consists of the symbols described below.

     ( )   used as delimiters in RQL

     [ ]   used to indicate anything optional inside
           the square brackets

     |     used to denote a choice of the word either
           before or after the bar

     { }   used to specify zero or more occurrences of
           anything in the curly brackets

     < >   used as metasymbols to denote a construct in
           RQL with the name of the construct between
           the metasymbols


All other words in RQL are key words and must appear
literally [Ref. 5]. In the remaining sections key words are
capitalized. In sections explaining the commands, an
abbreviated syntax of each command followed by an
explanation of the command is provided. This information is
taken from [Ref. 5] and [Ref. 6].


B. DATA DEFINITION COMMANDS

In RQL the commands are presented without regard to
function. However, in most database books (e.g., [Ref. 7]
and [Ref. 8]), there is a distinction between the data
definition language and the data manipulation language.
Although this distinction is not made in RQL, it provides a
logical division of the majority of commands and facilitates
the understanding of the commands. The data definition
language consists of those commands which are used for the
description of database objects.

Data can be represented in seven different types in RDM
1100. The two character specifications available are for
the compressed (c) and uncompressed (uc) character string
with the user providing a maximum length, up to 255
characters. The difference is that the compressed character
string is not stored with trailing blanks. Integers can be
declared with three different byte sizes, namely 1, 2, or 4.
The byte size of an attribute limits the precision which can
be accommodated in the attribute values. Finally,
floating-point numbers can also be expressed as compressed
(f) or uncompressed (uf). The range provided by these two
formats is identical and of 31 significant digits. As in
character strings the user must specify the number of
significant digits desired. The difference between
compressed and uncompressed floating point is the
suppression of leading and trailing zeros in the compressed
floating point. Compression is a feature designed to reduce
the storage requirement in the database. The following
declaration is an example of the use of attribute types:


     name = c25, salary = uf8, age = i1, address = c200

This example establishes four attributes: 'name' whose
values can each consist of up to 25 characters, 'salary'
whose values are floating-point numbers each of which is of
eight significant digits, 'age' whose values are one-byte
integers, and 'address' whose values are character strings
of up to 200 characters each. Notice 'name' and 'address'
are designated as compressed and therefore trailing blanks
of their values are not stored.
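
The declaration above can be read mechanically. The following Python sketch is an illustration only and is not RDM 1100 code: the names `parse_declaration` and `stored_length` are hypothetical, and the only storage rule modeled is the one stated in the text, namely that a compressed character string is stored without its trailing blanks.

```python
import re

# Type prefixes named in the text: c/uc (character), i (integer), f/uf (floating point).
SPEC = re.compile(r"^(uc|uf|c|i|f)(\d+)$")

def parse_declaration(decl):
    """Split 'name = c25, age = i1' into {attribute: (type_prefix, size)}."""
    attrs = {}
    for item in decl.split(","):
        name, fmt = (part.strip() for part in item.split("="))
        match = SPEC.match(fmt)
        if match is None:
            raise ValueError(f"unknown format: {fmt}")
        kind, size = match.group(1), int(match.group(2))
        if kind == "i" and size not in (1, 2, 4):
            raise ValueError("integers are 1, 2, or 4 bytes")
        if kind in ("c", "uc") and size > 255:
            raise ValueError("character strings are limited to 255 characters")
        attrs[name] = (kind, size)
    return attrs

def stored_length(kind, size, value):
    """Characters kept for a string value: 'c' drops trailing blanks, 'uc' keeps full size."""
    if kind == "c":
        return len(value.rstrip(" "))
    return size

schema = parse_declaration("name = c25, salary = uf8, age = i1, address = c200")
```

Under these assumptions a 'name' value of "SMITH" followed by blanks occupies five characters when declared c25, while the same value declared uc25 would occupy all 25.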


1. To Create a Database

     CREATE DATABASE <name> [WITH <options>]

This command is used to establish a database which
will be referred to by the user-specified name. The two
options provided are DISK and DEMAND. DISK allows the
specification of one or more disks on which the database
will be stored (e.g., DISK = 'sys'). DEMAND specifies the
number of 2K-byte blocks to be allocated for the database.
If the database grows beyond the allocated blocks, it may be
extended with the following command:

     EXTEND DATABASE <name> WITH <options>

The options are identical to the options of CREATE DATABASE.


2. To Create a Relation

     CREATE RELATION <name> (<attribute name> = <format>
          {, <attribute name> = <format>}) [WITH <options>]

The create command is used to establish the schema
for a relation. An empty relation is set up in the database
when the command is executed with the actual specification
of the attributes in parentheses being given as depicted in
the example of data types above. One possible option which
may be declared is LOGGING. This option causes every change
to the relation to be logged in the database transaction
log. This feature is extremely important to maintain the
consistency and integrity of the relation when system
recovery must be initiated.

3. To Create an Index

     CREATE [UNIQUE] [NONCLUSTERED | CLUSTERED] INDEX
          [ON] <object name> (<attribute> {, <attribute>})

An index on an attribute of a relation provides a
direct access to the attribute values in the relation. A
unique index on an attribute requires all attribute values
to be different. There are two primary differences between
clustered and nonclustered indices. A clustered index is
nondense (i.e., one entry/block) whereas the nonclustered
index is dense (i.e., one entry/tuple). The second
difference relates to the storage of data. Although the
nonclustered index does not affect the placement of data,
the clustered index requires the tuples of the relation to
be stored in the order of the attribute values.
Consequently, only one clustered index may be created for a
relation whereas 250 nonclustered indices may be defined for
the same relation. For performance data on operational
enhancement provided by indices, see [Ref. 2] and [Ref. 3].
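
The dense/nondense distinction can be made concrete with a small Python sketch. It is not a description of how RDM 1100 lays out its indices; the block size, the function names, and the sample data are all assumptions chosen for illustration. It shows only that a clustered index needs one entry per block because the tuples are stored in key order, whereas a nonclustered index needs one entry per tuple.

```python
# Tuples stored in blocks; a clustered index requires them sorted on the key.
BLOCK_SIZE = 3  # tuples per block, chosen only for illustration

def build_blocks(tuples, key):
    """Sort tuples on the key and pack them into fixed-size blocks."""
    ordered = sorted(tuples, key=lambda t: t[key])
    return [ordered[i:i + BLOCK_SIZE] for i in range(0, len(ordered), BLOCK_SIZE)]

def clustered_index(blocks, key):
    """Nondense: one entry per block (lowest key in the block, block number)."""
    return [(block[0][key], n) for n, block in enumerate(blocks)]

def nonclustered_index(blocks, key):
    """Dense: one entry per tuple (key value, block number, slot in block)."""
    entries = [(t[key], n, s)
               for n, block in enumerate(blocks)
               for s, t in enumerate(block)]
    return sorted(entries)

people = [{"name": n, "age": a} for n, a in
          [("ADAMS", 41), ("BAKER", 29), ("CLARK", 35), ("DAVIS", 52),
           ("EVANS", 23), ("FROST", 47), ("GRANT", 31)]]
blocks = build_blocks(people, "name")          # storage order follows 'name'
by_name = clustered_index(blocks, "name")      # 3 entries for 3 blocks
by_age = nonclustered_index(blocks, "age")     # 7 entries for 7 tuples
```

Because the storage order can follow only one key, a second clustered index on 'age' would contradict the first, which is the reason the text gives for allowing a single clustered index per relation.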


4. To Create a View

     CREATE VIEW <view name> (<target list>)
          [WHERE <qualification>]

The CREATE VIEW command establishes a virtual
relation, i.e., there is no storage of tuples associated
with the view. A view is a composite relation (without its
own tuples) of attributes from other relations or views.
The target list is the list of attributes desired from the
other relations or views. Finally, the qualification allows
the user to restrict the quantity of data in the view to a
particular category and to provide necessary linkages
between the relations or views.


5. To Define a Stored Command

     DEFINE <stored command name>
          <command> {<command>}
     END DEFINE

In RQL the DEFINE command provides a mechanism for
creating subroutines in the database machine. Stored
commands may have parameters or be parameterless. The
<command> can be an APPEND, DELETE, REPLACE, RETRIEVE, etc.
(to be discussed later). There are two advantages to stored
commands. One is that it relieves the operator of retyping
a frequently employed command and allows the DBA to provide
a simplified method for invoking complex queries. The
second and perhaps most important advantage is the
performance enhancement. Since the stored command exists in
the database with all addresses of cited relations resolved,
the communication between the host and the backend machine
is reduced to passing an EXEC token and the command name.
Examples of stored commands are provided in Appendix A.

6. To Destroy a Database

     DESTROY DATABASE <name>

The DESTROY DATABASE command eliminates the entire
database by removing all linkages from RDM 1100 and freeing
the storage space.


7. To Destroy an Object

     DESTROY <object name>

This command eliminates existing relations,
established views or stored commands from the database. The
space freed by the command is reusable by the database. As
indicated previously, views and stored commands depend on
existing relations or views. These underlying objects are
said to have dependencies. An object which has dependencies
cannot be destroyed without first destroying the dependent
object. This does not apply to indices, which are
automatically destroyed when the relation is destroyed.
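
The dependency rule can be sketched in a few lines of Python. This is an assumed model, not the RDM 1100 catalog: the dictionary `catalog`, the object names in it, and the `DependencyError` class are hypothetical and exist only to show the order in which objects must be destroyed.

```python
class DependencyError(Exception):
    """Raised when an object that others are built on is destroyed too early."""

# Object name -> names of the objects it is built on (hypothetical catalog).
catalog = {
    "employee": set(),
    "dept": set(),
    "emp_view": {"employee", "dept"},
}

def destroy(catalog, name):
    """Remove an object unless another object still depends on it."""
    dependents = sorted(obj for obj, bases in catalog.items() if name in bases)
    if dependents:
        raise DependencyError(f"{name} is still used by {', '.join(dependents)}")
    del catalog[name]
```

In this sketch `destroy(catalog, "employee")` fails while `emp_view` exists; destroying `emp_view` first allows the relation to be destroyed afterwards.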


8. To Destroy an Index

     DESTROY [NONCLUSTERED | CLUSTERED] INDEX [ON]
          <object name> (<attribute name>
          {, <attribute name>})

If an index is unnecessary or the overhead
associated with keeping an index is too high, the index may
be deleted from a database by the DESTROY INDEX command. In
addition to the object name, the user must also specify the
exact attributes of the index for the purpose of avoiding
any ambiguity.


C. DATA MANIPULATION COMMANDS

The data manipulation language is that subset of RQL
commands which allows the user to access, update, and
retrieve the data stored in the database.


1. To Retrieve Data

     RETRIEVE [UNIQUE] (<target list>) [ORDER [BY] <order
          specification> [:A | D]
          {, <order specification> [:A | D]}]
          [WHERE <qualification>]

The RETRIEVE command is the most commonly employed
command in RQL. It is the means by which data is extracted
from the database and returned to the user. The target list
provides the user with the facility to reduce the amount of
data by limiting the number of attribute values requested.
The format for the target list is:

     relation name.attribute name
          {, relation name.attribute name}

This list of attributes can be from one or more relations.
To reduce duplicate information, UNIQUE can be employed.
The order specification dictates the order (i.e.,
alphabetic, numeric or alphanumeric order) in which the data
is to be sorted. Finally, the qualification allows the user
to specify predicates which the data must satisfy and to
require linkages between relations. These predicates and
linkages reduce the number of tuples retrieved.
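
The roles of the target list, UNIQUE, the order specification, and the qualification can be imitated on in-memory data. The Python sketch below is only an analogy for a single relation: the function `retrieve`, its parameter names, and the sample `employee` tuples are assumptions, and it requires the ordering attribute to appear in the target list, which RQL itself may not require.

```python
def retrieve(tuples, target, where=None, order_by=None, descending=False, unique=False):
    """Imitate RETRIEVE: qualify, project onto the target list, drop duplicates, sort."""
    rows = [t for t in tuples if where is None or where(t)]
    result = [tuple(t[a] for a in target) for t in rows]
    if unique:
        result = list(dict.fromkeys(result))  # keeps the first occurrence of each row
    if order_by is not None:
        # Simplification: the ordering attribute must be part of the target list.
        result.sort(key=lambda r: r[target.index(order_by)], reverse=descending)
    return result

employee = [
    {"name": "ADAMS", "dept_no": 10, "salary": 30000.0},
    {"name": "BAKER", "dept_no": 20, "salary": 42000.0},
    {"name": "CLARK", "dept_no": 10, "salary": 36000.0},
]
depts = retrieve(employee, ["dept_no"], unique=True,
                 order_by="dept_no", descending=True)
well_paid = retrieve(employee, ["name"],
                     where=lambda t: t["salary"] > 32000.0, order_by="name")
```

The first call corresponds roughly to a RETRIEVE UNIQUE with a descending order specification; the second shows the qualification reducing the number of tuples while the target list reduces the number of attribute values.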


2. To Append New Tuples

     APPEND [TO] <relation name> (<value list>)
          [WHERE <qualification>]

The APPEND command allows the user to add tuples to
a specific relation. The value list must specify the
attribute names and attribute values with an equality sign
in between. Unlisted attribute values in the value list are
assigned default values (i.e., blanks for characters and
zeros for numerals).


3. To Replace Attribute Values

     REPLACE <relation name> (<value list>)
          [WHERE <qualification>]

REPLACE provides the facilities for updating values
stored in the database. Although it can only change one
relation at a time, the number of attribute values is not
limited. Further, more than one relation can be accessed to
calculate what is to be updated. Although a view name may
be used in place of the relation name in REPLACE and APPEND
commands, the numerous restrictions on the acceptability of
this procedure make it almost impotent and at best
infeasible.


4. To Delete Tuples

     DELETE <relation name> [WHERE <qualification>]

This command is used to remove one or more tuples
from a relation. If the WHERE clause is not specified then
all tuples will be deleted.


5. To Aggregate Attribute Values

There are six scalar aggregates supplied in RQL
which may be applied to one or more attribute values. These
aggregates return a single value, known as the scalar, to
the user. The results of MIN and MAX are the smallest and
largest attribute values found for the attribute,
respectively. SUM and AVG provide the arithmetic total and
mean of the respective attribute values. COUNT returns the
number of occurrences of the specific attribute value. ANY
is used to test for the existence of a specific attribute
value. This is accomplished by applying ANY to a condition
(e.g., ANY (relation name.attribute name = value)). If
the condition is true for at least one attribute value a '1'
is returned, '0' otherwise. Any scalar aggregate can have a
predicate (qualification) and, since it returns a single
value, can be used anywhere a scalar value is permissible in
an expression or other predicate. UNIQUE can be used with
COUNT, SUM, and AVG to avoid including duplicate entries in
the computed scalar value. For example, COUNT UNIQUE can be
used on a personnel database to retrieve the number of
different states (assuming birthplace is an attribute)
represented by the employees' place of birth without regard
to the actual number of employees from each state. These
scalar aggregates are useful in providing statistics about
the database and in isolating tuples whose attribute values
are numeric. For example, a query can be composed to
provide a list of attribute values such that each value is
greater than the average of the values.
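
The six scalar aggregates map directly onto ordinary list operations. The following Python sketch uses made-up salary and birthplace values; `agg_any` and `count_unique` are hypothetical helper names standing in for ANY and COUNT UNIQUE, and nothing here reflects how the backend evaluates them.

```python
from statistics import mean

def agg_any(values, condition):
    """ANY: 1 if the condition holds for at least one value, 0 otherwise."""
    return 1 if any(condition(v) for v in values) else 0

def count_unique(values):
    """COUNT UNIQUE: number of distinct values."""
    return len(set(values))

salaries = [30000.0, 42000.0, 36000.0, 42000.0]
birthplaces = ["CA", "NY", "CA", "TX"]

summary = {
    "MIN": min(salaries), "MAX": max(salaries),
    "SUM": sum(salaries), "AVG": mean(salaries),
    "COUNT": len(salaries),
}
states = count_unique(birthplaces)   # distinct states, not employees per state
above_average = [s for s in salaries if s > mean(salaries)]
```

The last line is the "greater than the average" query from the text: the scalar returned by AVG is used inside the predicate that selects the values.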

6. Aggregate Functions

The term 'function' is misleading when used in this
context since the result of applying an aggregate function
is a list of scalars. Although this is not the generally
accepted concept of a function (returning a single value) in
the literature, it will continue to be used in this thesis.
Aggregate functions are used in conjunction with the 'group
by' (BY) clause. This clause provides a partition of the
attribute values. The partitioned values can then be used
as arguments of an aggregate function. There can be more
than one aggregate function in a query, and aggregate
functions may be nested. Additionally, aggregates may
appear in both the target list and qualification. An
example of the application of an aggregate function can be
found in Section V.

The aggregate functions provide the computational
power of RQL. Without the functions there would be no easy
method of dividing attribute values into sets and performing
tests or computations on these value sets. Further, the use
of aggregate functions relieves the user from creating
numerous temporary relations and from manipulating them
individually for the desired result. For example, in a
personnel relation with salaries and department numbers as
attributes, it may be desirable to compute the average
salary of each department. This is easily accomplished by
the use of the aggregate function in the target list as
follows:

     answer = AVG (salary BY dept_no)

If this capability were not available, some other form of
partitioning would be required to support the query. One
might provide a separate retrieve for each department
number, form a temporary relation for the retrieved salary
figures, and average on each newly formed relation
separately.
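
The partition-then-aggregate behavior of the BY clause can be sketched in Python. The function `avg_by` and the sample `personnel` tuples are assumptions made for illustration; the sketch shows only that one pass over the relation replaces the separate retrieve and temporary relation per department described above.

```python
from collections import defaultdict

def avg_by(tuples, value_attr, group_attr):
    """Partition tuples on group_attr, then average value_attr within each partition."""
    groups = defaultdict(list)
    for t in tuples:
        groups[t[group_attr]].append(t[value_attr])
    return {g: sum(vals) / len(vals) for g, vals in groups.items()}

personnel = [
    {"dept_no": 10, "salary": 30000.0},
    {"dept_no": 20, "salary": 42000.0},
    {"dept_no": 10, "salary": 36000.0},
]
answer = avg_by(personnel, "salary", "dept_no")
```

The result holds one average per department, which matches the text's remark that an aggregate function yields a list of scalars rather than a single value.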

7. String-Manipulation Functions

In order to maintain the simple format of a
relational system and yet provide the capability to obtain
data based on partial or combined attributes, RQL includes
three string manipulation functions. The most useful of the
three for the user appears to be pattern matching. By using
symbols to represent any number of characters, tuples having
a desired internal pattern in a specific attribute can be
selected. Again, consider a personnel relation with an
attribute of date of birth. The pattern matching function
could be used to provide a list of all personnel born in a
particular month. Pattern matching applies only to
characters and is only used in predicates.

The other two functions are CONCAT and SUBSTRING.
These functions can be used with character or binary
attributes. CONCAT requires two string arguments, strips
all trailing blanks from both strings and concatenates the
second string to the first. The SUBSTRING function requires
a starting position in the string, a length to define the
number of characters desired, and the attribute on which to
perform the operation. For an example of SUBSTRING
employment, see Appendix A.
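
The three string functions can be approximated in Python as follows. This is a sketch under stated assumptions: the text does not give RQL's wildcard symbol or the base of the SUBSTRING starting position, so the `*` wildcard of Python's `fnmatch` module and a 1-based start are used here purely for illustration, and the date format of the sample data is invented.

```python
from fnmatch import fnmatchcase

def concat(first, second):
    """CONCAT as described: strip trailing blanks from both strings, then join."""
    return first.rstrip(" ") + second.rstrip(" ")

def substring(start, length, value):
    """SUBSTRING with a 1-based start position (the base is an assumption)."""
    return value[start - 1:start - 1 + length]

# Hypothetical date-of-birth attribute stored as a character string.
births = {"ADAMS": "1950-03-14", "BAKER": "1948-11-02", "CLARK": "1955-03-30"}

# Pattern matching: '*' stands for any number of characters.
born_in_march = sorted(n for n, d in births.items() if fnmatchcase(d, "*-03-*"))
month_of_adams = substring(6, 2, births["ADAMS"])
```

The pattern selects tuples by an internal part of the attribute, which is the "personnel born in a particular month" query from the text.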

8. System Supplied Functions

There are three categories of system supplied
functions available in RQL. These provide information about
the database and host, cross reference of system assigned
identification numbers to associated character strings, and
data type conversions.

The first group of functions is parameterless and
provides general information about the host and database. For
example, a user may request the name of the database
[DATABASENAME ()], the time or date [GETTIME () or GETDATE
()], the identity of the host [HOST ()], the identity of the DBA
[DBA ()], or who is executing a command [USERID ()].

The second group of functions is useful in providing
information in a meaningful form to the user. There are
three self-explanatory commands in this group [REL_ID
(relation name), REL_NAME (relation ID), and FIELD_NAME
(relation ID, attribute ID)]. These translations are used
extensively in Appendix A.

The last group provides the capability to convert
expressions (exp) from one data type to another. For
example, a user may convert an expression to a 1-, 2-, or 4-
byte integer [INT1 (exp), INT2 (exp) or INT4 (exp)], a
binary number [BIN (exp)], or a floating-point number [FLOAT
(length, exp)]. The expression can be any one of the other
types listed as well as string and binary coded decimal in
their legal forms (e.g., compressed and uncompressed).


D. EXPRESSING THE RELATIONAL OPERATIONS IN THE QUERY
LANGUAGE

The power of a relational query language is usually
measured by its ability to perform the operations specified
in relational algebra or relational calculus. Since the
equivalence of the two has been demonstrated [Ref. 8], the
relational algebra will be used for comparative purposes
without loss of generality. It should be noted that RQL is
probably best characterized as a domain-based relational
calculus.

The relational algebra supports four traditional set
operations (Union, Intersection, Difference, and Cartesian
Product) and four special relational operations (Selection,
Projection, Join, and Division). All eight operations will
be defined and an example of each in RQL will be provided.
In the examples the term relation name will be abbreviated
to 'rel' and the term attribute name to 'attr'.

1. The Selection Operation

The selection operation provides a subset of tuples
in a relation which satisfy a given qualification. All
attribute values of every tuple satisfying the predicate are
included in the subset. RQL provides an ALL keyword which
simplifies the selection operation by avoiding the
enumeration of every attribute in the target list.

     RETRIEVE UNIQUE (rel.ALL) WHERE <qualification>


2. The Projection Operation

Projection is used to reduce the number of attribute
values in the tuples which make up the selected subset. In
addition to limiting the number of attribute values in a
tuple, the projection operation also deletes duplicate
tuples from the subset. Deleting duplicates can be enforced
by using the optional keyword UNIQUE. Projection in RQL is
a function of the target list in the retrieve command. A
qualification may be used to reduce the number of tuples as
in selection. To reiterate, selection reduces the number of
tuples whereas projection reduces the number of attribute
values.

     RETRIEVE UNIQUE (rel.attr1, rel.attr2, ..., rel.attrn)
          WHERE <qualification>
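
Selection and projection as described above can be stated compactly over a relation modeled as a Python set of tuples. The sketch is illustrative only; the function names, the positional addressing of attributes, and the sample relation are assumptions. Using a set means duplicates disappear automatically, which corresponds to the UNIQUE keyword.

```python
def select(relation, qualification):
    """Selection: keep whole tuples that satisfy the qualification."""
    return {t for t in relation if qualification(t)}

def project(relation, positions):
    """Projection: keep the listed attribute positions; the set drops duplicates."""
    return {tuple(t[p] for p in positions) for t in relation}

# Sample relation with attributes (name, dept_no, age).
rel = {("ADAMS", 10, 41), ("BAKER", 20, 29), ("CLARK", 10, 35)}
older = select(rel, lambda t: t[2] > 30)   # fewer tuples, all attributes
dept_numbers = project(rel, [1])           # fewer attributes, duplicates removed
```

Three tuples project onto two department numbers here, showing the duplicate removal that the text attributes to the projection operation.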


3. The Join Operation

The join operation may be performed on any number of
relations whose attributes are defined over a common domain.
The result of the join is a new, higher-degree relation.
Each tuple in the resultant relation is formed by
concatenating tuples from the source relations whose
attribute values satisfy the qualification.

There are different qualifications and therefore
different joins. The equi-join is formed over an equality
predicate. The inequality join is formed over an inequality
predicate with an operator such as <, >, <=, >= or !=. The
following is an example of an equi-join; the other joins can
be realized by manipulating the target list (natural join)
or predicate (inequality join), accordingly.

     RETRIEVE UNIQUE (rel1.ALL, rel2.ALL)
          WHERE rel1.joinattr = rel2.joinattr
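
An equi-join over two small relations can be sketched in Python as a nested loop. This is a conceptual illustration rather than the backend's join method; `equi_join`, the positional join attributes, and the sample relations are assumptions.

```python
def equi_join(rel1, rel2, pos1, pos2):
    """Concatenate every pair of tuples whose join attributes are equal."""
    return {t1 + t2 for t1 in rel1 for t2 in rel2 if t1[pos1] == t2[pos2]}

employee = {("ADAMS", 10), ("BAKER", 20), ("CLARK", 30)}   # (name, dept_no)
dept = {(10, "SUPPLY"), (20, "PAYROLL")}                    # (dept_no, dept_name)
joined = equi_join(employee, dept, 1, 0)
```

Each result tuple has degree four and carries the join attribute twice. Dropping one copy in the target list gives the natural join, and replacing `==` with another comparison gives an inequality join.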

4. The Division Operation

The division operation is defined for two relations
in which the divisor relation has a degree less than the
degree of the dividend relation. The resultant relation has
a degree equal to the difference of the degrees of the two
relations. The division operation is demonstrated using
relations rel1 and rel2 and dividing rel1 by rel2 where the
degrees of rel1 and rel2 are m and n respectively with m >
n. The resultant relation consists of the first (m-n)
attribute values for each tuple in the dividend, rel1, such
that every tuple in the divisor, rel2, exists as the last n
attribute values of the uniquely determined partial tuples
(identical first (m-n) attribute values) in rel1. For
example, if a relation X has tuples abcd, abef, abab, and
bccd, and relation Y has tuples cd and ef, then X divided by
Y would be the relation containing the tuple ab. This tuple
ab would exist in the resultant relation since abcd and abef
are in X. However, the tuple bc would not appear since bcef
is not in X.

     RETRIEVE (rel1.attr1, rel1.attr2, ..., rel1.attr(m-n))
          WHERE COUNT (rel2.attr1) =
               COUNT (rel1.attr1 BY rel1.attr1, rel1.attr2,
                    ..., rel1.attr(m-n)
               WHERE rel1.attr(m-n+1) = rel2.attr1
               AND rel1.attr(m-n+2) = rel2.attr2
               AND ...
               AND rel1.attrm = rel2.attrn)
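
The counting idea behind the division query can be traced in Python. The sketch below is an assumed rendering, not the RQL evaluation strategy: for each distinct leading part of the dividend it collects the divisor tuples matched by the trailing part, and keeps the leading part only when every divisor tuple has been matched. The relations X and Y are sample data in the spirit of the example in the text, and the fourth tuple of X is illustrative.

```python
from collections import defaultdict

def divide(rel1, rel2):
    """Divide rel1 (degree m) by a non-empty rel2 (degree n) by counting matches."""
    n = len(next(iter(rel2)))
    matches = defaultdict(set)
    for t in rel1:
        head, tail = t[:-n], t[-n:]
        if tail in rel2:               # last n values equal some divisor tuple
            matches[head].add(tail)    # grouped BY the first (m-n) values
    # Keep a head only if its matched tails cover the whole divisor.
    return {head for head, tails in matches.items() if len(tails) == len(rel2)}

X = {tuple("abcd"), tuple("abef"), tuple("abab"), tuple("bccd")}
Y = {tuple("cd"), tuple("ef")}
quotient = divide(X, Y)
```

The head ab is paired with both cd and ef and therefore survives, while bc is paired only with cd and is rejected.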


5. The Union Operation

Union is the traditional set-theoretic definition of
union with the additional constraint of requiring the two
relations to be union-compatible. Union-compatibility
stipulates that the two relations must be of the same degree
and the corresponding attribute values must be taken from
the same domain (e.g., rel1.attrk and rel2.attrk must be
defined over the same domain). The union of two union-
compatible relations is the set of all tuples belonging to
either relation or both relations. Note that duplicates are
not automatically eliminated, so a RETRIEVE UNIQUE
(union_rel.ALL) can be executed after the following example
to display the union.

     RETRIEVE INTO union_rel (rel1.ALL)
          WHERE <qualification>

     APPEND TO union_rel (attr1 = rel2.attr1,
          attr2 = rel2.attr2, ..., attrlast = rel2.attrlast)
          WHERE <qualification>


6. The Intersection Operation

Intersection is only defined for union-compatible
relations. The resultant relation is comprised of tuples
which exist identically in both of the relations.

     RETRIEVE UNIQUE (rel1.ALL)
          WHERE rel1.attr1 = rel2.attr1
          AND rel1.attr2 = rel2.attr2
          AND ...
          AND rel1.attrlast = rel2.attrlast
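
Union and intersection of union-compatible relations correspond to the ordinary set operators. In the Python sketch below, which is an illustration and not RQL behavior, union-compatibility is reduced to a check that all tuples have the same degree because attribute domains are not modeled; the function names and sample relations are assumptions.

```python
def check_compatible(rel1, rel2):
    """Union-compatibility, reduced here to equal degree (domains are not modeled)."""
    degrees = {len(t) for t in rel1 | rel2}
    if len(degrees) > 1:
        raise ValueError("relations are not union-compatible")

def union(rel1, rel2):
    """All tuples in either relation; the set removes duplicates, as RETRIEVE UNIQUE would."""
    check_compatible(rel1, rel2)
    return rel1 | rel2

def intersection(rel1, rel2):
    """Tuples that exist identically in both relations."""
    check_compatible(rel1, rel2)
    return rel1 & rel2

rel1 = {(1, "a"), (2, "b")}
rel2 = {(2, "b"), (3, "c")}
both = union(rel1, rel2)
common = intersection(rel1, rel2)
```

The tuple (2, "b") appears once in the union even though it is present in both inputs, which is the duplicate elimination that the text obtains with a final RETRIEVE UNIQUE.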


7. The Cartesian-Product Operation

Given two relations, rel1 and rel2, of degree m and n
respectively, the Cartesian product is the set of all tuples
of degree (m+n) formed by taking the first tuple in rel1 and
concatenating to it all tuples (one at a time) in rel2. This
process is then repeated for each subsequent tuple in rel1
until all tuples in rel1 have been concatenated with every
tuple in rel2.

     RETRIEVE UNIQUE (rel1.ALL, rel2.ALL)


8. The Difference Operation

The difference of two union-compatible relations is the set of
tuples in the first relation but not in the second.

RETRIEVE UNIQUE (rel1.ALL)
WHERE 0 = ANY (rel1.attr1 BY rel1.attr1
        WHERE rel1.attr1 = rel2.attr1
        AND ...
        AND rel1.attrlast = rel2.attrlast)

This query requires each tuple in rel1 to be compared with every
tuple in rel2. In the above example, it is assumed that
rel1.attr1 is the key for the relation. In the event a relation
has a composite key, the rel1.attr1 following the BY can be
replaced by a linear list of attributes comprising the key.
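
The four set operations above can be checked against a small model. The Python fragment below is a sketch under stated assumptions, not RQL: relations are modelled as sets of tuples, the sample data is invented, and Python sets discard duplicates automatically, which corresponds to the result of RETRIEVE UNIQUE rather than to the intermediate relation.

```python
from itertools import product

# Two hypothetical union-compatible relations of degree 2.
rel1 = {(1, "x"), (2, "y"), (3, "z")}
rel2 = {(2, "y"), (3, "z"), (4, "w")}

union = rel1 | rel2            # tuples in either relation or both
intersection = rel1 & rel2     # tuples identical in both relations
difference = rel1 - rel2       # tuples in rel1 but not in rel2

# Cartesian product: every rel1 tuple concatenated with every rel2
# tuple, giving tuples of degree m + n.
cartesian = {t1 + t2 for t1, t2 in product(rel1, rel2)}
```

As in the difference query, every tuple of the first relation is compared against the second; the set operators simply hide that comparison.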
IV. THE DATABASE ADMINISTRATOR


The role of DBA is to establish the database and to ensure that
the database system is responsive to the user's performance
requirements and information needs. Although the discussion of
DBA will use RDM 1100 as the target system, the facilities
described and DBA support required should be applicable to any
relational database management system. In particular, the amount
of DBA support required does not depend on a particular system.
If the system does not provide certain facilities, DBA will be
required to reformat and/or extract the information from the
database to satisfy the users' information needs. Finally, DBA
will be referred to as an individual; however, the functions can
be the responsibility of a group of people.

This section will discuss the functions and qualifications of DBA
in the areas of database environment, database design, system
services, user services, security, and performance enhancement.
For each area, a generalized statement concerning DBA functions
and qualifications will be provided; then a specific description
of the function in the RDM 1100 environment will follow, together
with the RDM 1100 feature which supports it.
A. THE DATABASE SYSTEM ENVIRONMENT

A software database management system is designed to support a
single database on a general-purpose computer. The advantage of a
backend relational database machine is the support which can be
provided to multiple hosts and multiple databases. The existence
of multiple databases on a single machine creates two levels of
management. Level one, the system DBA, is primarily concerned
with machine-wide performance and establishing authorizations for
the database DBAs. Level two, the database DBAs, is concerned
with the operational data in the individual databases. DBA and
system DBA should be knowledgeable in the areas outlined above to
ensure efficient and reliable database performance.

In RDM 1100 the system DBA has control of a database called the
system database. Certain commands such as creating and destroying
databases can be issued only from the system database. When a new
database is created the individual issuing the CREATE DATABASE
command will be DBA for that database. In this thesis DBA will
refer to the level-two DBA unless otherwise indicated.


B. THE DATABASE DESIGN - THE PHYSICAL AND CONCEPTUAL SCHEMAS

As alluded to above, DBA has numerous areas of concern. The
second area to be addressed is the database design. This topic
describes the design of the physical schema and the conceptual
schema. A schema is simply a plan for a particular level of the
database. The third level, external schema, will be addressed in
Section IV C.

The physical schema, also called the internal schema, is a plan
for the actual storage of data on the physical devices available
to the database. In RDM 1100 each disk is divided into zones of
180 2K-byte blocks. The first block in each zone is reserved for
a directory to the contents of that block. The number of blocks
required for a relation is dependent on the number of tuples and
the length of the tuples. Since the physical schema is a function
of the database system, the major issue from the DBA perspective
is whether the system allows the location of data and indices to
be explicitly specified.
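
The dependence of storage on the number and length of tuples can be made concrete with a rough estimate. The Python sketch below uses the zone and block sizes quoted above; everything else is an assumption made for illustration: a 2K-byte block is taken as 2048 bytes, tuples are assumed never to span blocks, per-block overhead is ignored, and the function names are hypothetical.

```python
import math

BLOCK_BYTES = 2 * 1024                      # 2K-byte blocks (taken as 2048)
BLOCKS_PER_ZONE = 180                       # blocks per zone
DATA_BLOCKS_PER_ZONE = BLOCKS_PER_ZONE - 1  # first block holds the directory


def blocks_required(num_tuples, tuple_bytes):
    """Estimate the data blocks needed for a relation.

    Assumes tuples never span blocks and ignores per-block overhead.
    """
    if tuple_bytes <= 0 or tuple_bytes > BLOCK_BYTES:
        raise ValueError("tuple length must be between 1 and BLOCK_BYTES")
    tuples_per_block = BLOCK_BYTES // tuple_bytes
    return math.ceil(num_tuples / tuples_per_block)


def zones_required(num_tuples, tuple_bytes):
    """Estimate the zones needed, one directory block reserved per zone."""
    blocks = blocks_required(num_tuples, tuple_bytes)
    return math.ceil(blocks / DATA_BLOCKS_PER_ZONE)
```

Under these assumptions a relation of 10,000 tuples of 100 bytes each holds 20 tuples per block and therefore needs 500 data blocks, which spread over three zones.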

The conceptual schema is the logical plan normally associated
with the entire 'organizational view' and instituted by DBA. As
the physical schema is comprised of the actual location and
storage structure of the entire database, the conceptual schema
includes the names of all relations, indices, and data dictionary
entries in the database.

The primary query language subset used to define the conceptual
and physical schemas is the data definition language. The mapping
between the physical and conceptual schema is performed by the
database system. This mapping is built as the objects are made
known to the database system.

In RQL the CREATE <object> commands are the primary commands
employed to specify the conceptual and physical schemas. The
database system will construct a data dictionary for each object.
This includes making the object known to the system, reserving
appropriate storage, and describing the appearance of the object
(e.g., number, size, and type of attributes). In order to design
the physical and conceptual schemas, DBA must know the
organizational structure and must understand database
normalization, the database system architecture, and the concepts
of data sharing and ownership.

1. Organizational Structure

Since DBA is responsible for ensuring that the database reflects
the 'real world' of the organization it supports, there is ample
justification for a good working knowledge of the organization.
The objective is 'to develop a plan which will accurately reflect
the organizational requirements without a need to continuously
redesign the database'. Although it is tempting to limit the
application to one functional area like personnel, DBA must be
aware of the relationships between the personnel and other
entities in the organization. Without a total organizational
picture DBA will ultimately be faced with redesign to meet the
organization's needs.
2. Normalization

In order to enhance database reliability and reduce redundancy, a
solid foundation in relational database design principles is
required. One extremely important aspect is DBA's understanding
of normalization. Once a specific normal form is established for
a database, DBA must realize the possible implications of
deviations from this normal form and should document the
exceptions. Normal forms are not specifically discussed in this
thesis and the reader is directed to [Ref. 7] for more
information.

The RDM 1100 system, like most existing relational systems,
requires only that all relations be in first normal form. This
normal form stipulates that each attribute value in a relation
must be atomic. That is, the value is not decomposable. Further,
there is only a single value selected from the specific domain
for an attribute. Higher normal forms must be enforced by DBA.

3. Database System Architecture

DBA must also understand the architecture of the database system
to exploit efficiencies or avoid deficiencies. Since database
users do not have static applications and the data stored is also
dynamic, DBA must know how to monitor and enhance performance, if
possible, when user requirements can no longer be satisfied.

4. Data Sharing and Ownership

One of the primary reasons for employing a database management
system is to share the data among users. This provides a
reduction in the storage of redundant data and alleviates the
possibility of anomalies associated with redundancy. The concept
of sharing data must be tempered with the requirements of user
needs and information security. Therefore, a means must be
available to provide control over the data and to permit the
controlling authority to decide who will have access to the data
he controls.

In RDM 1100 the control of data is a function of ownership and
access rights. The creator of an object is the owner of that
object. Objects which may be owned are databases, relations,
views and stored commands. The owner of the object must
explicitly permit other users (less DBA) to access the object or
portions of an object (e.g., specific relation attribute values).
For a more detailed discussion of ownership and access rights,
see Section IV C.

5. Recommendations

In database design the first step is to develop a strategy to
meet the organizational information requirements. Since the
conceptual schema is the comprehensive data description of the
organizational information structure, the second step would
entail the designing of the conceptual schema. By using this
approach, data independence can be maintained which will prevent
the modification of applications programs due to changes in the
physical database. Further, the dependencies between the
conceptual schema and user requirements can be documented to
ensure changes at the conceptual level will not result in changes
to the applications programs.

It should be noted that DBA must control the creation of
relations and indices in such a manner that a specific normal
form can be maintained. In addition to conforming to the imposed
normal form constraints, each user creating relations to satisfy
his own needs must not violate the relations of the database
supporting other users. Such violations will certainly introduce
excessive and redundant information, and undermine the initial
database design. Additionally, if individuals sharing a database
are permitted to create objects at will, the sharing of data by
all users may be subverted and the database could rapidly
deteriorate to a user-determined filing system.


C. THE DATABASE DESIGN - THE EXTERNAL SCHEMA

As important as the physical and conceptual schemas are to the
implementation of a single database, the establishment of the
external schema is critical to the users. In considering
divergent user application requirements, the external schema
provides the means to define precisely what will satisfy each
user's information needs. The external schema of the database is
different for each user or group of users. These schemas are
composed of subsets of the conceptual schema. The definition of
the contents of a particular external schema is normally
accomplished through access control of objects existing at the
conceptual level. By restricting the relations, attributes,
stored commands, and/or views available to a user, a subset of
the entire database is defined.

A user's access to the database is determined by the user's
access rights. The access rights of a user are authorized by DBA
and consequently DBA controls user access to the database. In
addition to the verification and matching of host ID and
host-user ID to the database system-user ID, these access rights
are the only means for access control in the majority of database
systems. In RQL the PERMIT and DENY commands on physical objects,
virtual objects, and stored commands can be used to establish the
various external schemas of the database.

1. Permit/Deny Access

There are two access rights which must be available in a database
system to provide a user with the appropriate level of
information. These access rights are read and write. Execute
privileges can be considered a special case of indirect
read/write just as create can be a special case of write.

The two commands in RQL which assign the access rights are PERMIT
and DENY. The PERMIT command grants a user a specified access
right over an object or command and the DENY command revokes or
removes such access rights. DENY is primarily employed to revoke
a previously granted PERMIT.

The access rights available in RQL are READ, WRITE, EXECUTE, and
CREATE. PERMIT READ provides access to the specified objects
(relation, view or named attributes of a relation or view). To
modify or add data to existing relations in the database a PERMIT
WRITE for the user or group of users on the objects or portions
of objects must be explicitly authorized. The keyword ALL can be
used to grant read, write, and execute privileges to a user or
group of users. Only the owner of the object is authorized to
DENY access to the object.

There are two cases of implicit access in RDM 1100. DBA is
authorized access to all objects in the database to which he has
not been explicitly denied access by the owner. Even if access is
denied to DBA by the owner, DBA may still destroy the object by
deleting all references to it in the database relations
(non-user). Additionally, the owner of any object is permitted
access to that object. All other accesses must be authorized by
the owner of the object. This is the essence of the access
control system.
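
The access rules just described can be summarized in a small model. The Python class below is an illustrative sketch only, not the RDM 1100 implementation: the class and method names are hypothetical, rights are plain strings, and the DBA's residual ability to destroy an object is not modelled.

```python
class AccessControl:
    """Toy model of owner-controlled PERMIT/DENY with two implicit cases."""

    def __init__(self, dba):
        self.dba = dba
        self.owners = {}       # object name -> owner
        self.granted = set()   # (user, object, right) triples
        self.denied = set()    # (user, object, right) triples

    def create(self, owner, obj):
        """The creator of an object becomes its owner."""
        self.owners[obj] = owner

    def _require_owner(self, issuer, obj):
        if self.owners.get(obj) != issuer:
            raise PermissionError("only the owner may PERMIT or DENY")

    def permit(self, issuer, user, obj, right):
        self._require_owner(issuer, obj)
        self.denied.discard((user, obj, right))
        self.granted.add((user, obj, right))

    def deny(self, issuer, user, obj, right):
        self._require_owner(issuer, obj)
        self.granted.discard((user, obj, right))
        self.denied.add((user, obj, right))

    def allowed(self, user, obj, right):
        if user == self.owners.get(obj):
            return True                     # implicit: the owner
        if (user, obj, right) in self.denied:
            return False                    # explicit DENY by the owner
        if user == self.dba:
            return True                     # implicit: DBA unless denied
        return (user, obj, right) in self.granted
```

In this model an ordinary user has no access until the owner issues a permit, while DBA has access until the owner issues a deny, which mirrors the two implicit cases above.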
2. Create Physical Objects

The database management system must provide facilities to create
physical objects in the database. Initially, the database must be
created with an assigned DBA. Following this, the physical
objects in the database must be created. Although an index is not
a physical object which may be manipulated by the user, it is
discussed in this topic.

In RQL the right to execute the CREATE DATABASE command must be
explicitly granted by the system DBA. Once a user has
authorization to create a database, the execution of the CREATE
DATABASE command makes him DBA of the named database. To add new
users to the database, DBA employs the database administrator
utilities (DBAU) program. The DBAU NEW_USERS command assigns the
host-user ID and host ID and places them in the HOST_USERS
relation. The DESTROY DATABASE command can only be executed by
DBA (i.e., the owner of the database).

CREATE <object> is also controlled through the PERMIT and DENY
commands. The permission is similar to CREATE DATABASE in that
only DBA for a particular database may authorize the creation of
objects. Relations and indices are the physical objects which a
user may be authorized to create. Only the owner of a relation is
allowed to create an index on the relation using the CREATE INDEX
command. However, DBA must authorize the owner use of the CREATE
INDEX command. Each of the above discussed commands has a
counterpart for revocation. Only the owner or DBA may destroy
relations and indices.

3. Create Virtual Objects

Once the physical objects are created, it is necessary to create
the virtual objects in the conceptual schema which will define
the external schema for each user. Views are the virtual objects
which a user may be authorized to create.

CREATE VIEW requires the user to have access to the relations
over which the view is defined. Only the owner of the view may
destroy the view with the DESTROY <object> command.


4. Access Via Stored Commands

Finally, an indirect read/write (execute) access is necessary to
allow users to extract information from the database through the
use of stored commands. Stored command is an RQL term for a user
defined function or procedure. Although this feature may not be
available in every database system, it is very useful and
powerful when provided. In addition to the efficiency issue of
stored commands discussed previously, it is much easier for the
user to execute stored commands than to input long queries. In
RQL PERMIT EXECUTE allows a specified user or group of users to
execute stored commands.
5. Recommendations

There are three methods for providing an external schema to a
user or group of users. The first method is through restriction
of access on the physical objects in the database. The second
method is to define virtual relations which consist only of the
necessary subset of data the user is required to access. Finally,
the third method entails the extraction of information from the
database through the exclusive use of stored commands.

In RQL the major problems with the first two methods are the
addition and deletion of data and implementation of ALL. As
mentioned in Section III there are too many restrictions on the
use of views for updating the database. Additional problems can
arise using the first method as a result of the system assigning
default values to attributes which are not explicitly listed in
an APPEND command. For example, an insertion of a tuple with a
blank key field (employee number) for a new employee's salary and
name would result in a tuple containing the employee's name and
salary with a blank key field. A separate insertion containing
the employee number and name would result in two tuples in the
same relation for a single employee.

The stored command can be executed without granting the user
access rights to the relation(s) which are accessed by the
command. However, exclusive use of stored commands for
information retrieval is not reasonable since anticipation of
every query which a user could possibly require is not possible.

There is a trade-off between access control, performance, and
relational perspective. Each of these issues requires the
sacrifice of one of the trade-off features. In order to resolve
this problem, a combination of the prescribed methods is
required. The use of stored commands to input and delete data in
a well structured database removes the restrictions on the use of
views. Further, stored commands can force the entry of all
mandatory attribute values for a tuple through parameter-argument
matching which eliminates the duplicate tuple problem described
above. Combining stored commands for updating with the use of
views to define the external schemas would provide the most
logical approach. In order to employ this strategy a major system
change is required in the implementation of ALL.
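
The duplicate tuple problem, and the way parameter-argument matching avoids it, can be illustrated with a short sketch. The Python below is not RQL and makes several assumptions: a relation is modelled as a list of dictionaries, the attribute names are invented, `append_with_defaults` stands in for an append that fills unlisted attributes with blanks, and `add_employee` stands in for a stored command whose parameters are all mandatory.

```python
employees = []  # stand-in for an employee relation


def append_with_defaults(**values):
    """Mimic an append that fills unlisted attributes with blank defaults."""
    row = {"number": "", "name": "", "salary": ""}
    row.update(values)
    employees.append(row)


def add_employee(number, name, salary):
    """Stand-in for a stored command: every parameter must be supplied."""
    if not number:
        raise ValueError("employee number (key) is required")
    employees.append({"number": number, "name": name, "salary": salary})


# Two partial insertions for the same person leave two tuples behind,
# one of them with a blank key field.
append_with_defaults(name="Smith", salary=30000)
append_with_defaults(number="E17", name="Smith")
split_rows = [row for row in employees if row["name"] == "Smith"]
```

Calling `add_employee` without the employee number fails before anything is stored, so the relation never receives a tuple with a blank key.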

First, it should be obvious that the most logical mechanism for
producing an external schema is the view. However, the major
problem is the necessity to provide access to the ATTRIBUTE
relation to permit the use of ALL with the view name. Therefore,
ALL should be implemented such that only the attributes or
relations the users are authorized to access are returned. This
should not carry an implicit access to the ATTRIBUTE relation.
Access to the ATTRIBUTE relation can be restricted by implicit
use of user-id predicates on all queries on data dictionary
relations. The performance issue results from the implementation
of ALL in the host and the resultant communications between the
host and the backend to process a query containing ALL. This
performance degradation can be rectified by implementing ALL in
the backend relational database machine.



D. SYSTEM SERVICES

The third area is the services provided to DBA by the system. DBA
will use these services to facilitate system backup, crash
recovery and provide information about the database. The system
services establish a nucleus of information and facilities which
DBA may be required to augment for his own personal preferences
and needs.

1. System Backup

Two areas of system backup must be provided to DBA to ensure
proper system functioning. The first area is the necessity of
providing a means to record the contents of the database when it
is in a consistent state. This is employed most frequently by the
system DBA and is addressed further in the next topic on crash
recovery.

The second area is the need to return the database to a previous
consistent state as a result of aborting a transaction. A
transaction is a single command or a series of commands which
must be left uncommitted until the final command has finished.
This situation can result from a user decision to abort his
transaction prior to completion or the necessity of rolling a
transaction back as a result of deadlock. Deadlock occurs in a
multiple user system when one user holds a resource (e.g.,
relations) another user requires and the second user holds a
resource the first requires. In this situation, the system is
said to be deadlocked since neither user can complete his
transaction. To resolve deadlock only one of the transactions
must be rolled back. The solution to user aborts is to restore
the database to the state it was in prior to the abort.
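
The deadlock condition described above amounts to a cycle of users each waiting on the next. The Python sketch below shows one common way to detect such a cycle; it is an illustration under stated assumptions, not the mechanism used by RDM 1100. The function name and the wait-for mapping are hypothetical, and each user is assumed to wait on at most one other user.

```python
def find_deadlock(waits_for):
    """Return the users forming a cycle in the wait-for graph, or None.

    waits_for maps a waiting user to the user holding the resource
    it requires (simplification: at most one entry per user).
    """
    for start in waits_for:
        seen = []
        user = start
        # Follow the chain of waits until it ends or repeats.
        while user in waits_for and user not in seen:
            seen.append(user)
            user = waits_for[user]
        if user in seen:
            return seen[seen.index(user):]
    return None


# Each user holds a resource the other requires.
waits = {"user1": "user2", "user2": "user1"}
cycle = find_deadlock(waits)

# Rolling back only one transaction in the cycle breaks the deadlock.
del waits[cycle[0]]
after_rollback = find_deadlock(waits)
```

Removing a single participant from the cycle is sufficient, which matches the observation that only one of the transactions must be rolled back.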

In RDM 1100 the function of backing up transactions is invisible
to DBA. The TRANSACT relation (to be discussed later) is used to
maintain the before and after attribute values affected by the
transaction for relations created with the logging option. The
BATCH relation is used for the other relations. A transaction is
by default a single command unless the explicit commands BEGIN
before and END TRANSACTION after a group of commands are
specified. ABORT TRANSACTION can then be issued after BEGIN and
before END to cause rollback. RDM 1100 employs an optimistic
concurrency control algorithm which does not prevent deadlock
from occurring. The resolution of deadlock is completely
invisible to the user and DBA.
2. Crash Recovery

Another facility which must be provided to DBA is the ability to
recover from a system malfunction. This is particularly important
when the data on the disk has been lost or contaminated. To avoid
excessive time delays, periodic copies of the entire database
must be made to reduce the amount of updating. The frequency of
copying the database is dictated by the number of changes in a
period of time and the time demands of the applications programs.
The normal method of recovery requires the most recent copy of
the database and the transactions which have occurred since the
copy was made. Once the copy of the database is loaded, the
transactions are rerun to bring the database up to date. Since
the chronological list of transactions is the key to recovery, it
must be copied from the database on a frequent basis even though
the copying of the entire database may be less frequent due to
the time required. Of course, some transactions which were in
progress or not in a transaction list must be reinitiated by the
user.

RDM 1100 provides DUMP DATABASE and LOAD DATABASE commands in the
DBAU facility. Additionally, DUMP TRANSACTION is provided to make
copies of the transaction log. The command which allows rerunning
transactions after a LOAD DATABASE command has been executed is
ROLLFORWARD.
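
The recovery procedure, loading the most recent copy and then rerunning the logged transactions in order, can be sketched as follows. This Python fragment is a conceptual illustration only and does not reproduce the RDM 1100 commands: the function name, the dictionary used for the database copy, and the log entry format are all assumptions.

```python
import copy


def roll_forward(dump, log):
    """Rebuild a database from a copy plus a chronological log.

    dump: dict mapping key -> tuple value at the time of the copy.
    log:  list of (operation, key, value) entries recorded after the
          copy was made; operation is "put" or "delete". This entry
          format is hypothetical.
    """
    database = copy.deepcopy(dump)      # load the most recent copy
    for operation, key, value in log:   # rerun in chronological order
        if operation == "put":
            database[key] = value
        elif operation == "delete":
            database.pop(key, None)
        else:
            raise ValueError(f"unknown log operation: {operation}")
    return database


saved_copy = {"E1": ("Smith", 30000)}
transaction_log = [
    ("put", "E2", ("Jones", 28000)),
    ("put", "E1", ("Smith", 31000)),
    ("delete", "E2", None),
]
recovered = roll_forward(saved_copy, transaction_log)
```

Any change that never reached the log is absent from the recovered database, which is why such transactions must be reinitiated by the user.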
3. System Information

The database system employed must provide a data dictionary and
statistical information on the database configuration and
performance. A data dictionary contains descriptive information
about the database. It must include all the various schemas
(physical, conceptual, external) and should include
cross-reference information such as which programs use what data
and synonyms.

In RDM 1100 there are 13 system-supplied database relations which
contain descriptive information about the associated database. In
addition, there are seven system relations which provide a global
description of the database machine.

The system relations provide a catalog of the databases in RDM
1100, a list of disks known to the system, status and types of
locks in the system (used for concurrent processing), and the
configuration of the communications interface to the attached
host(s). Another system relation provides information concerning
the activity currently taking place in the database. Two
additional relations are used to provide performance data.

Perhaps more important for DBA are the 13 relations associated
with each database. Each relation is listed below and a brief
description of the type of information contained is provided. The
first 11 are used to supply data dictionary information and the
last two provide information related to transaction management.


RELATION NAME   DESCRIPTION

RELATION        A single tuple is provided for each object in
                the database. This tuple includes, as
                appropriate, the name of the object, owner,
                relation identification number, size, location,
                number of tuples and their length, type of
                object (user, system, transaction log, file,
                view or stored command), and the number of
                attributes.

ATTRIBUTE       A tuple is entered for every attribute in the
                database. This tuple includes the attribute
                identification number, data type, maximum
                length, associated relation ID, and attribute
                name.

INDICES         Each index has a tuple in the relation. The
                attributes include the index identification
                number, relation ID, number of attributes in
                the index, location, and attribute ID(s).

PROTECT         Contains information associated with the
                explicit access authorized on objects for users
                in the database.

QUERY           Contains the stored commands and views.

CROSSREF        Describes the dependencies among relations,
                indices and stored commands in the database.
                The dependencies are system defined and not
                user specified.

USERS           Describes the mappings between user
                identification numbers, names, and user groups.

HOST_USERS      Defines the mapping between the host ID,
                host-user ID, and RDM 1100 user ID.

BLOCKALLOC      Catalogs the sector assignment within a zone.
                Each tuple represents a sector and the assigned
                object.

DISK_USAGE      Describes database disk allocation.

DESCRIPTIONS    Contains user-specified, textual descriptions
                of objects and attributes in the database.

BATCH           Contains temporary logging information used by
                the system for transaction management. This
                relation provides information on transactions
                against logged and non-logged relations so they
                can be canceled if required. The transaction
                information is held until the transaction is
                committed.

TRANSACT        Permanent logging information used for crash
                recovery.


All of the above relations provide the comprehensive picture of
the database. Although the information is in the relations, much
of it is not in a usable format. For example, only the RELATION
relation contains the textual name of a relation. Other relations
use the internally assigned relation ID. Further, some of the
information is encoded. In order to translate this information
into an understandable format DBA must develop stored commands
(preferable to ad hoc queries). The number of stored commands
will be dependent on the desires of DBA. However, a minimum
subset should include commands to list the relations, attributes,
indices, attributes in an index, access list associated with an
object, description of an object, and dependencies.
The following stored commands are used to yield the minimum
subset. TABLES is used to provide a list of objects by type
(DBA-supplied parameter to TABLES). For relations, relation name,
type, and the number of attributes and tuples in the relation are
provided. FIELDS provides the relation name (parameter specified
by DBA to FIELDS), attribute name, data type of the attribute
(bin, char, int, etc.), and the length of the attribute for every
attribute in the relation. ALL_INDICES provides a list of all
indices on user relations in the database. The information
provided includes the relation name, index identification number,
number of attributes in the index, and a narrative description of
the type of index. An additional command, INDEX_LIST, is used to
provide the same information as ALL_INDICES except a
relation/view name is passed as a parameter and only the indices
on that relation/view are returned. ATT_IN_INDX1 and ATT_IN_INDX2
are used to list the attributes in an index by name. These
commands require two parameters: the index ID and the relation
name. The reason for the development of two separate commands is
the readability of the output. PROTECTION provides the object
name, user name, and type of access authorized for an object
which is passed as a parameter. Another command, ACCESS_LIST, is
provided to describe an object and the associated access list for
a particular object. WHATIS provides a narrative explanation of
its parameter from the DESCRIPTIONS relation. DEPENDS is used to
provide a list of the dependencies on an object. Finally, another
useful command is WHOCREATES which provides a list of users who
have been granted create permission in the database. RQL
constructs for the stored commands described above are provided
in Appendix A.

4. Translator

Upon implementation of a relational database, it will be
necessary to load the data into the system. Since the data exists
on some storage device (disk, tape, etc.), there should be a
mechanism for presenting the data to the system for immediate
loading in a relational format.

In RDM 1100, assuming the data can be collated as a sequence of
records on a disk or tape, the 'translator' can then be used to
load the database on a relation-by-relation basis. The
'translator' will ask a series of questions to ascertain the
incoming data format and establish the relation schema. The
following questions must be answered for a relation. The answers
are parenthesized.


1. Output directly to the RDM? (y/n)

2. Input file (name)

3. Database (name)

4. Name of table (relation name)

5. Name of first field (name of first attribute)

6. Enter input type and length (input file format)

7. Enter output type and length (c12, i1, etc.)

8. Starting position (input file)
   (Questions 5 through 8 are repeated for each attribute.)

9. Record length (input file)
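
The answers to these questions describe a fixed-length record layout: a name, type, length, and starting position for each attribute, plus the overall record length. The Python sketch below shows how such a description could drive the splitting of an input file into tuples. It is an assumption-laden illustration, not the RDM 1100 translator: the function names are hypothetical, starting positions are 0-based, and the type codes "c" (character) and "i" (integer) are simplified stand-ins.

```python
def parse_record(record, fields):
    """Split one fixed-length record into attribute values.

    fields: list of (name, kind, start, length) with a 0-based start;
    kind is "c" for character data or "i" for an integer.
    """
    row = {}
    for name, kind, start, length in fields:
        raw = record[start:start + length]
        row[name] = int(raw) if kind == "i" else raw.rstrip()
    return row


def load_relation(data, record_length, fields):
    """Cut the input into records of record_length and parse each one."""
    if len(data) % record_length != 0:
        raise ValueError("input is not a whole number of records")
    return [parse_record(data[i:i + record_length], fields)
            for i in range(0, len(data), record_length)]


# Hypothetical layout: an 8-character name followed by a 6-digit salary.
layout = [("name", "c", 0, 8), ("salary", "i", 8, 6)]
rows = load_relation("Smith   030000Jones   028500", 14, layout)
```

Each record of 14 characters yields one tuple, with the name padding removed and the salary converted to an integer.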


E., USER SERVICES 

The fourth area is DEA Supeert provided to the users of 
the relation database system. OBA should previde 
services/facilities to the users cf tne database depending 
on their applications and experience level, aA discussion of 
user services in two general areas will be addressed. tInese 
areas are providing a help facility and froviding stored 
commands, 

li. Help Facility 

As with any interactive system, a nelp facility tis 
required to preclude timeecensunming, trialeance-errer 
corrective precedures. For a relational database system the 
help facility sheuld inelude, at a minimum, the syntax and 
explanation of every language cemmand and an explanation of 
the stored commands, relaticns, and views, 

In RQL this can be accomplished by creating a help
relation with three attributes (object, line number, and
text) and defining a stored command which, given an object
as a parameter, will explain its purpose or use. The stored
command must contain appropriate predicates in the WHERE
clause to ensure the user can only retrieve information from
the help relation about objects which he is authorized to
access. An example of the help relation and stored command
is provided in Appendix A.
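
The same idea can be sketched outside RQL. The Python
fragment below uses the standard sqlite3 module purely as a
stand-in; the table names, the simplified protect table, and
the user number 7 are assumptions for illustration and do not
reproduce the RDM 1100 system relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE help_rel (object TEXT, line_no INTEGER, description TEXT);
CREATE TABLE protect  (object TEXT, user_id INTEGER);  -- hypothetical grants
INSERT INTO help_rel VALUES
    ('TABLES',  1, 'LISTS RELATIONS, VIEWS OR'),
    ('TABLES',  2, 'STORED COMMANDS.'),
    ('PAYROLL', 1, 'SALARY DATA.');
INSERT INTO protect VALUES ('TABLES', 7);
""")


def help_for(obj: str, user_id: int) -> list:
    """Return help text lines only if the user holds a grant on the object."""
    rows = conn.execute(
        """SELECT h.description
           FROM help_rel h JOIN protect p ON p.object = h.object
           WHERE h.object = ? AND p.user_id = ?
           ORDER BY h.line_no""",
        (obj, user_id),
    ).fetchall()
    return [r[0] for r in rows]


print(help_for("TABLES", 7))   # two lines of help text
print(help_for("PAYROLL", 7))  # empty: user 7 has no grant on PAYROLL
```

The join against the grant table plays the role of the
WHERE-clause predicates described above: an unauthorized
request simply returns no tuples.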

2. Stored Commands Provided By DBA

DBA must have an in-depth knowledge of the query
language. It is not reasonable to assume that the average
user will become proficient in the use of the query
language. Both query language complexity and performance
issues must be considered. The examples in Sections III and
V demonstrate some of the complexities in RQL. DBA will be
required to assist the user in the proper formulation of
some queries. In addition, the users will look for
assistance when confronted with any perceived problem in the
database. Since DBA is a database expert, the user will
naturally request his assistance.

In addition to applications-oriented RQL stored
commands, which are not discussed, DBA should provide
commands similar to those described earlier in this section
for the user. Specifically, DEPENDS, WHATIS, PROTECTION,
ATT_IN_INDX1, ATT_IN_INDX2, INDEX_LIST, FIELDS, TABLES, and
HELP should be provided. The only difference between the
DBA commands and the user's stored commands is the inclusion
of the necessary predicates in the WHERE clause to limit the
response to data to which the user has been granted access.
Other minor modifications may also be desired. For example,
TABLES could be parameterless and return all relations,
views, and stored commands to which the user has access.
PROTECTION could be modified to return only the accesses on
objects the user owns.


F. SECURITY 

The fifth area for DBA concern is the security of the
database. The security of a database system is plagued with
the same problems associated with computer security in
general. The normal mechanism for security is access
control. Since a database system is attached (backend) to a
host, the security measures provided by the host are the
first level of security afforded the database system. The
user ID-password logon procedure employed by general-purpose
computers can be used for database systems to provide the
same security checks. Additionally, a host ID check in
conjunction with the previously mentioned validation can be
performed when a backend system is used. Security is also
afforded by the backend machine configuration since the
database machine is separate from the host and uses its own
disks for data storage.

The first security check performed by RDM 1100 is the
verification of the host and host-user ID. These IDs are
verified each time a request is made from the host to the
backend machine. Since the security of the database is
closely associated with the security of the host, the use of
passwords on the host for identification and verification is
essential. The user ID-password logon procedure is not
employed in RDM 1100 but is taken from the host, which means
there is not an additional ID-password required for the
backend machine. In addition to the verification and
matching of host ID and host-user ID to the database
system-user ID, the access control rights are the only
security mechanisms available in RDM 1100.

There are two implicit access rights in RDM 1100. The
owner (creator) of any object and DBA are permitted access
to that object unless explicitly denied by the owner. All
other accesses must be authorized by the owner of the
object. This is the essence of the security system. The
remainder of this topic will discuss specific security
weaknesses in the RDM 1100.

1. Security Aspects of 'ALL'

A crucial aspect for security is the implementation
of ALL. ALL is used in a query as a synonym for every
attribute of the relation in the target list. As previously
discussed, there is not a user ID qualification associated
with ALL. Therefore, the translation of ALL to its
attribute equivalents is based on the object (relation or
view). ALL does not work with a view or a relation unless
the user has READ access on the ATTRIBUTE relation.
However, once this access is authorized, the user can
examine the entire conceptual schema, which is certainly a
violation of security.

2. System Messages


The use of relation.attribute(s) or
view.attribute(s) in the target list returns two separate
error messages if read access to the object is not
permitted. One error message (permission denied on ...)
indicates the attribute name is valid but access is not
authorized. The other error message (... not found) can be
interpreted as the attribute name is non-existent. Although
extremely tedious, the error messages can be used in a
trial-and-error method to obtain the conceptual schema.

3. User Identification Numbers

Another serious weakness in the security of RQL is
the deletion of a user from a database. The easiest method
is to delete the user from the HOST_USERS relation, which
will prohibit him from opening the database. However, if a
new user is added to the database from DBAU and the system
assigns him the UID which was previously assigned to a
deleted user, the new user will inherit all the accesses
which were established by DBA and owners for that UID. This
is not acceptable since there should not be any implicit
authorizations for a new user.

4. Recommendations 

The recommendations for correcting the
implementation of ALL are discussed in Section IV CS above.
Although not as informative, the return of a single error
message for both access denied and relation.attribute not
found would provide less information about the conceptual
schema of the database. From a user's perspective it does
not appear to be significant whether access is denied or the
object is not in the database. The critical issue is to
avoid divulgence of the conceptual schema to a user not
authorized this information.
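
The difference between the two reporting styles can be
illustrated with a small Python sketch. The schema, the
grants, and the message wording are all hypothetical; the
point is only that a single reply for both failure cases
gives a probing user nothing to distinguish.

```python
SCHEMA = {"EMP": {"NAME", "SALARY"}}   # hypothetical relation and attributes
READ_ACCESS = {("alice", "EMP")}       # hypothetical (user, relation) grants


def check_two_messages(user, relation, attribute):
    """Distinct replies: the wording reveals which attribute names exist."""
    if attribute not in SCHEMA.get(relation, set()):
        return f"{relation}.{attribute} not found"
    if (user, relation) not in READ_ACCESS:
        return f"permission denied on {relation}.{attribute}"
    return "ok"


def check_single_message(user, relation, attribute):
    """One reply for both failures: nothing is learned about the schema."""
    exists = attribute in SCHEMA.get(relation, set())
    if exists and (user, relation) in READ_ACCESS:
        return "ok"
    return f"{relation}.{attribute} not available"


# An unauthorized user probing a real and an invented attribute name:
for attr in ("SALARY", "BONUS"):
    print(check_two_messages("bob", "EMP", attr),
          "|", check_single_message("bob", "EMP", attr))
```

With two messages the probe separates SALARY (exists) from
BONUS (does not); with a single message both replies have the
same form.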

The two methods for correcting the user ID problem
are the explicit deletion of all access rights in the
database (PROTECT, USERS, HOST_USERS) for the old user by
DBA, or providing a command in the DBAU to delete a user
from a specified database which will explicitly remove all
the accesses the user has been granted. The second method is
preferable to the first since the system should provide this
service to DBA.
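
The UID problem and the preferred remedy can be modelled in
a few lines of Python. This is a toy model under stated
assumptions: the Database class, the rule that the lowest
free UID is reused, and the PAYROLL object are invented for
the example and are not taken from the RDM 1100.

```python
class Database:
    def __init__(self):
        self.users = {}       # uid -> user name
        self.protect = set()  # (uid, object) access tuples

    def add_user(self, name):
        # Assumption: the system hands out the lowest unused UID.
        uid = next(i for i in range(1, len(self.users) + 2)
                   if i not in self.users)
        self.users[uid] = name
        return uid

    def grant(self, uid, obj):
        self.protect.add((uid, obj))

    def delete_user_only(self, uid):
        """Remove the user but leave access tuples behind."""
        del self.users[uid]

    def delete_user_and_rights(self, uid):
        """Remove the user and every access tuple held under that UID."""
        del self.users[uid]
        self.protect = {(u, o) for (u, o) in self.protect if u != uid}

    def can_read(self, uid, obj):
        return (uid, obj) in self.protect


db = Database()
old = db.add_user("old_user")
db.grant(old, "PAYROLL")
db.delete_user_only(old)
new = db.add_user("new_user")
print(new == old, db.can_read(new, "PAYROLL"))  # reused UID inherits access
```

Calling delete_user_and_rights instead of delete_user_only
leaves no tuples for the reused UID to inherit, which is the
behaviour the second method asks the system to provide.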


G. FINE-TUNING PERFORMANCE

The last area of concern for DBA is the performance
enhancement of an existing database. Given that a
relational database system has already been selected and the
overall performance factors have been established, DBA
nevertheless does have a few tools at his disposal which can
enhance performance. There are features in the query
language implementation which are more efficient than
others. For example, a join can run faster depending on
which relation is held in cache. One language uses the last
relation listed in the query if other factors are equal.
Thus, the order of the relations could be important.
Another example is the use of parentheses to resolve
ambiguity in a list of logical predicates. These features
are highly implementation-dependent and will not be
addressed further. The other three features are data
reorganization, indices, and data placement.

In RQL DBA will be required to develop a performance
monitoring strategy which may include the periodic execution
of stored commands specifically designed to collect
performance data.

1. Data Reorganization

As data is added to and deleted from the database,
there is an associated fragmentation of relations in
physical storage. Even though many database systems provide
the capability to reserve extra space for relations, this
will result only in a delay of fragmentation. The extent of
fragmentation must be monitored and fragmentation eliminated
when necessary.

DBA may specify the number of blocks for a database
and for a relation in RQL. Additionally, FILLFACTORs can be
specified for clustered indices on relations. This
FILLFACTOR determines the percentage of each disk block
which will be used for the data in the relation when a
clustered index is created. When the fragmentation becomes
excessive, the clustered index can be destroyed, recreated,
and a new FILLFACTOR assigned. This procedure will resort
the data in the blocks available for the relation. A
relation will be allowed to grow until it uses all the
blocks it is authorized or all blocks in the database are
full. Since blocks are not reused when data is deleted
from a relation, this will result in reaching maximum block
capacity and fragmentation. DBA can monitor this activity
by writing a stored command on the 'BLOCKS' relation. The
ability to eliminate fragmentation for a non-indexed
relation will depend on the number of free consecutive
blocks available in the database. If enough blocks are
available, the data can be retrieved into a temporary
relation defined over the empty blocks, the original
relation destroyed, and the temporary relation renamed.
This strategy can also be employed when reclustering does
not offer a satisfactory solution to fragmentation.
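
The effect of a FILLFACTOR on space can be estimated with
simple arithmetic. The Python sketch below assumes that the
FILLFACTOR is a whole-number percentage and that a block
holds a fixed number of tuples when completely full; neither
figure comes from the RDM 1100 documentation, and the numbers
are illustrative only.

```python
import math


def blocks_needed(tuples: int, tuples_per_block: int, fillfactor_pct: int) -> int:
    """Blocks used when each block is loaded to fillfactor_pct percent."""
    # At least one tuple per block, even for a very small fill percentage.
    usable = max(1, (tuples_per_block * fillfactor_pct) // 100)
    return math.ceil(tuples / usable)


print(blocks_needed(1000, 20, 100))  # blocks when loaded completely full
print(blocks_needed(1000, 20, 50))   # blocks when loaded half full
```

A lower FILLFACTOR costs more blocks at creation time but
leaves room in each block for later insertions, which is the
trade-off DBA weighs when reclustering.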

2. Indices

Indices can enhance the performance of a database
for data retrieval. [Ref. 2] and [Ref. 3] have documented
the actual enhancement in RDM 1100. Since indices are
application-oriented, they are highly desirable for
databases where the majority of operations are retrieval of
data over large relations, or where relations which are
fairly static can be identified. If numerous update and
append transactions are envisioned, then a degradation in
performance could result due to the constant updating of the
indices. Therefore, DBA should be aware of the size of the
relations and types of operations performed on the
relations. For example, if insertion is prevalent then
avoidance of indices on the relations which require numerous
APPENDs, if possible, may reduce degradation.

3. Data Placement

Hypothetically, the placement of data on disks can
enhance performance. For example, if a join between two
relations is performed frequently, then placing the
relations on separate disks will reduce disk head movement
as data is moved into cache. Although this hypothesis has
not been verified due to the lack of facilities for placing
data in RQL, the data placement strategy should be
considered when explicit assignment of physical storage is
available. This could be even more significant when
processing data on-the-fly is realized, considering the
speed discrepancy between reading data and moving disk
heads.


V. EVALUATION OF THE RELATIONAL SYSTEM


A. THE FULLY RELATIONAL SYSTEM 

1. The Fully Relational Characteristics

The definition of a "fully relational" database
management system is given by Chris Date [Ref. 7]. Date
suggests that most existing systems are not fully
relational. The primary benefit of considering fully
relational as a standard and goal for implementation is in
the algebraic power of the language and the consistency of
system supplied functions. If the system is deficient in
any characteristics which Date describes, appropriate action
may be taken to provide a semblance of a fully relational
system. First, the concept of fully relational is defined;
then a comparison of RDM 1100 and RQL to the definition is
addressed.

In order for a database to be characterized as fully
relational it must support the following:

a. "relational databases (including the concepts of
domain and key and the two integrity rules);"

b. "a language that is as powerful as the relational
algebra (and that would remain so even if all
facilities for loops and recursion were to be
deleted)." [Ref. 7]

A relational database exhibits the following
properties:

a. Relations are in first normal form.

b. Associations between relations are explicitly
connected through common attributes.

c. Every value appearing in a given attribute is taken
from the domain for that attribute.

d. Every relation has a unique primary key which
distinguishes (identifies) individual tuples.


In addition to the above properties, two integrity
rules are required. First, a null value is not permissible
as an attribute value of a primary key. Second, if a
relation A has an attribute value which is also the primary
key of another relation B, then at all times the attribute
values in relation A must exist in B. This rule prevents
missing linkages among relations when attribute values are
added to relation A or removed from relation B.
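
The key property and the two integrity rules can be stated
as simple checks. The Python sketch below represents
relations as lists of dictionaries and uses None for a null;
the function names and the sample tuples are assumptions made
for illustration.

```python
def primary_key_ok(relation, key):
    """Property d and rule 1: key values are unique and never null."""
    values = [row[key] for row in relation]
    return all(v is not None for v in values) and len(values) == len(set(values))


def references_ok(a, foreign_key, b, primary_key):
    """Rule 2: every foreign-key value in A exists as a primary key in B."""
    known = {row[primary_key] for row in b}
    return all(row[foreign_key] in known for row in a)


# Hypothetical supplier (B) and shipment (A) relations.
S = [{"s_nr": "S1"}, {"s_nr": "S2"}]
SP_GOOD = [{"s_nr": "S1", "p_nr": "P1"}]
SP_BAD = [{"s_nr": "S3", "p_nr": "P1"}]  # S3 is not a known supplier

print(primary_key_ok(S, "s_nr"))
print(references_ok(SP_GOOD, "s_nr", S, "s_nr"))
print(references_ok(SP_BAD, "s_nr", S, "s_nr"))
```

A system that enforces the rules would run checks of this
kind on every append, replace, and delete rather than leaving
them to the user.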

2. Four Areas of Deficiency

There are four areas in which RDM 1100 does not meet
the requirements for a fully relational system. First,
although specification of the schema includes data types for
each attribute, no notion of an underlying domain is
incorporated. Since attributes are defined only by general
length and type, and comparisons of attributes are limited
only to similar types (e.g., character with character),
meaningless comparisons are allowed. Without the concepts of
sets, enumerated types, and ranges available in higher order
languages such as PASCAL or ADA, the support of domains will
always be questionable.

Second, the requirement for a unique primary key is
not enforced. The uniqueness of an attribute value can be
enforced by declaring a unique index on one of the candidate
keys. However, this associates an access path with the
concept of a key. These are two logically separate issues
and as such should be dealt with separately, since the
existence of a candidate key does not imply the need for an
access path on that attribute.

Third, nulls are not implemented in RDM 1100.
However, the default values for integers (zero) and
characters (blanks) are provided for unspecified attribute
values. Tuple(s) may be entered into a relation without
values for the key fields. Even if unique attribute values
are enforced through index specification, at least one tuple
with the default value in the key attribute will be
accepted.

Finally, relations are normally connected through
the repetition of some (or all) of the key attribute values
in one relation A and in another relation B. However, there
is no mechanism to ensure relation B does not contain a
value in the connecting attribute which does not exist in
relation A.

3. The Relational Completeness


RDM 1100 performs all the relational algebra
operations defined in Section III with one exception. This
exception deals with the elimination of duplicate tuples in
the results after applying certain operators (projection,
division, natural join, etc.). For example, although the
result relation may appear to satisfy a natural join, it is
obvious that duplicates are not a priori eliminated, since
this elimination is a function of the associated projection.
Additionally, a projection of an attribute in a relation
with duplicate entries will return all the values in the
attribute without regard to duplicates. A join could be
simulated by forming a Cartesian product of the two
relations, applying the predicates to the product,
extracting the concatenated tuples which satisfy the
predicates, and projecting the attributes from the target
list.
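
The simulated join described above, and the effect of not
eliminating duplicates, can be sketched in Python. The
relations, attribute names, and the simulated_join function
are hypothetical; attribute names are kept distinct across
the two relations so that the concatenated tuple is
unambiguous.

```python
from itertools import product


def simulated_join(r, s, predicate, target):
    """Cartesian product, then selection, then projection (duplicates kept)."""
    result = []
    for t1, t2 in product(r, s):
        combined = {**t1, **t2}          # concatenated tuple
        if predicate(combined):          # apply the predicates
            result.append(tuple(combined[a] for a in target))
    return result


S = [{"s_nr": "S1", "sname": "SMITH"}, {"s_nr": "S5", "sname": "ADAMS"}]
SP = [{"sp_s_nr": "S1", "p_nr": "P1"}, {"sp_s_nr": "S1", "p_nr": "P2"}]

rows = simulated_join(S, SP, lambda t: t["s_nr"] == t["sp_s_nr"], ["sname"])
print(rows)               # SMITH appears once per matching shipment
print(sorted(set(rows)))  # duplicates removed, as the relational algebra requires
```

The first result corresponds to the behaviour reported for
RDM 1100; converting it to a set is the extra step that a
true relational projection performs.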


B. COMPARISON OF TWO QUERY LANGUAGES

This topic provides a comparison of RQL and SQL. The
selection of SQL as the comparative language is based on the
relative familiarity of a large number of people with the
language and its widespread use.

1. Equal Power

The power of the two query languages is practically
identical. Both languages are relationally complete, which
implies:

a. Any relation derivable from the database relations
using an expression in the relational algebra can be
retrieved using the language, and

b. Any derivable relation can be retrieved using a
single statement in the language.

2. Differences in the Syntax Structure

The major difference between SQL and RQL is the
syntactic structure. Using the database in Figure 1 from
Date, an example of the two query languages will be given.
This example is a query to list the names of all
suppliers who do not provide part "P2". As can be seen from
inspection of Figure 1 the answer would be one supplier,
ADAMS.

SQL:  SELECT SNAME
      FROM S
      WHERE 'P2' ¬= ALL
            (SELECT PNR
             FROM SP
             WHERE SNR = S.SNR)

The query as stated in RQL is:

RQL:  RETRIEVE (S.SNAME) WHERE 0 = ANY (SP.P_NR BY
      S.SNAME
      WHERE SP.P_NR = "P2"
      AND S.S_NR = SP.S_NR) GO

S     SNAME   CITY
      SMITH   LONDON
      JONES   PARIS
      BLAKE   PARIS
      CLARK   LONDON
      ADAMS   ATHENS

P     PNAME   COLOR   WEIGHT  CITY
      NUT     RED     12      LONDON
      BOLT    GREEN   17      PARIS
      SCREW   BLUE    17      ROME
      SCREW   RED     14      LONDON
      CAM     BLUE    12      PARIS
      COG     RED     19      LONDON

SP    QTY

Figure 1. The Supplier-Parts Database





Without regard to implementation the above queries
are resolved as follows:

a. In the SQL example the set of suppliers and the
parts they supply is formed by the nested select. Then
the outer select will return a supplier's name if and
only if the set of parts supplied by that supplier does
not contain "P2".

b. In the RQL example the "BY" clause establishes the
same set as the inner select of the SQL query. Then the
two boolean expressions are evaluated with the "AND"
conjunction. If no tuples satisfy the conditions for a
given supplier, then the value of ANY (tuple) is 0. If
ANY is 0, the qualification evaluates to true, and the
supplier's name is returned. S.S_NR = SP.S_NR ensures
that suppliers in the S relation but not in the SP
relation are not ignored (i.e., that a supplier who
supplies no parts will be included as a tuple in the
answer to the query).
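
Both resolutions can be tried with Python's standard sqlite3
module. This is only an approximation: SQLite has no
quantified ALL comparison, so the nested form is written with
NOT IN, and the RQL aggregate is imitated with a correlated
COUNT. The table names, column names, and sample tuples are
assumptions chosen so that ADAMS supplies no parts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE s  (s_nr TEXT, sname TEXT);
CREATE TABLE sp (s_nr TEXT, p_nr TEXT);
INSERT INTO s VALUES ('S1','SMITH'), ('S2','JONES'), ('S3','BLAKE'),
                     ('S4','CLARK'), ('S5','ADAMS');
INSERT INTO sp VALUES ('S1','P1'), ('S1','P2'), ('S2','P2'),
                      ('S3','P2'), ('S4','P2'), ('S4','P4');
""")

# Nested-select style: "P2" is not among the parts of this supplier.
NESTED = """SELECT sname FROM s
            WHERE 'P2' NOT IN
                  (SELECT p_nr FROM sp WHERE sp.s_nr = s.s_nr)"""

# Aggregate style: the count of qualifying tuples per supplier is zero.
COUNTED = """SELECT sname FROM s
             WHERE 0 = (SELECT COUNT(*) FROM sp
                        WHERE sp.p_nr = 'P2' AND sp.s_nr = s.s_nr)"""


def names(sql):
    return [row[0] for row in conn.execute(sql)]


print(names(NESTED), names(COUNTED))
```

Both statements return the same supplier, including the one
that has no tuples in sp at all, which mirrors the role of
the S.S_NR = SP.S_NR predicate in the RQL form.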

The syntactic structure of the example demonstrates
the major differences in the two languages. SQL is highly
structured, with nested selects. On the other hand, RQL
does not permit nesting of retrieves but allows nesting of
aggregate functions to perform the same operations.
Although it would be purely subjective to favor one method
over the other, it appears that the structured approach of
SQL may be easier to learn initially. However, once the
aggregate functions of RQL are mastered, the lack of
redundancy may be more attractive.

3. Other Differences

RDM 1100 does not implement nulls, but does supply
default values (zeros for numeric fields, blanks for
character fields). Therefore, the results of the scalar
aggregates and aggregate functions (AVG, MIN, MAX, and SUM)
are not always predictable. This implies the user must be
extremely knowledgeable about the database and use the
aggregates with caution (e.g., explicitly exclude zero
values from aggregation). In SQL, queries can be constructed
with "not null" as a qualifier and the tuples with null
values in the attribute being aggregated will not be
included in the returned value.
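
The distortion caused by default values can be seen with a
small numeric example. The Python sketch below is a
hypothetical illustration: the weights and the two helper
functions are invented, with 0 standing for an unspecified
numeric value and None standing for a null.

```python
def avg_with_defaults(values):
    """Average when unspecified values were stored as the default 0."""
    return sum(values) / len(values)


def avg_excluding(values, missing):
    """Average over the values that are not the 'missing' marker."""
    present = [v for v in values if v != missing]
    return sum(present) / len(present) if present else None


stored_with_default = [12, 0, 18]   # second weight was never specified
stored_with_null = [12, None, 18]

print(avg_with_defaults(stored_with_default))   # default drags the average down
print(avg_excluding(stored_with_default, 0))    # user explicitly excludes zeros
print(avg_excluding(stored_with_null, None))    # nulls are skipped
```

Excluding zeros recovers the intended average here, but it
would also discard a genuine zero value, which is why the
user must know the data before applying the workaround.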

SQL uses 'ALL', 'HAVING', 'IN', and others to
provide a more set-theoretic description of database
manipulation. RQL provides the same capability in the
aggregate functions but the concept of set manipulation is
not explicit. RQL provides a 'MOD' function and some string
manipulation functions which are also available in SQL. The
string manipulation functions extend the power of RQL
particularly when working with database relations (i.e.,
non-user defined relations) which have attributes encoded as
binary values.

The 'MOD' function is not correctly implemented for
negative numbers in RQL or SQL. It returns the modulo class
of the argument as if the argument were a positive integer
and attaches the original sign. For example, -1 mod 8 =
-(1 mod 8) = -1. To avoid this inconsistency and to
correctly implement the mathematical definition, the
following nested application of mod is required for modulo
8:

    mod (mod (ARG, 8) + 8, 8).

The actual function implemented appears to be a remainder
function, which would be consistent since both query
languages are implemented in the programming language C. C
has a remainder function but not a mod function.
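
The behaviour and the correction can be reproduced in Python,
whose math.fmod keeps the sign of its first argument in the
same way as a C-style remainder. The function names are
chosen for this illustration only.

```python
import math


def c_remainder(a: int, n: int) -> int:
    """Truncating remainder: the result takes the sign of the argument a."""
    return int(math.fmod(a, n))


def nested_mod(a: int, n: int) -> int:
    """mod (mod (ARG, n) + n, n): the nested form given in the text."""
    return c_remainder(c_remainder(a, n) + n, n)


print(c_remainder(-1, 8))  # -1, the value reported for -1 mod 8
print(nested_mod(-1, 8))   # 7, the mathematical modulo class
print(nested_mod(9, 8))    # 1, positive arguments are unaffected
```

The inner call brings the argument into the range -7 to 7,
adding 8 makes it positive, and the outer call reduces it
again, so the result always lies between 0 and 7.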







VI. CONCLUSIONS


There are three major areas in which DBA must be
knowledgeable in order to ensure the successful management
of a database system. These areas are the user services,
performance enhancements, and security factors. The
specific relational database management system or backend
machine employed will dictate the amount of DBA support
required in each area.

The user services include the stored commands provided
by DBA, the loading of data into the system, the recovery of
the database as a result of system malfunction, and a help
facility. Although these are not comprehensive and the
exact amount of support will be discretionary on the part of
DBA, they do form the nucleus for DBA's planning of user
support. This support is critical to the acceptance and use
of the relational database system by the user community.

The basic tools DBA can use to enhance performance are
the implementation of the language features, indices, and
data placement. The performance enhancement which can be
gained from the query language is only achievable if DBA has
a solid understanding of the language and how it is
implemented. Certain features of the language will be
executed faster than others and since there are numerous
ways to form a query to obtain the same information,
knowledge about these characteristics can achieve more rapid
responses from the database. Therefore, DBA should review
user commands in applications programs and provide guidance
to users for the purpose of exploiting the more rapid
features. Of course, the specific features will vary
between languages.

Indices provide another performance tool in databases
where retrieval and joins are the primary operations. Even
if these operations are not the most prevalent, indices may
still be employed to enhance performance. If the database
has a large number of insertion operations, then avoiding
the placement of indices on the relations which are changing
frequently will prevent serious degradation attributable to
index updating. Additionally, if relations in this type of
database which are not subject to frequent insertions but
are used in numerous retrieves and joins can be identified,
then placement of indices on these relations over the
appropriate attributes will enhance the overall performance
of the database system.

The ability to explicitly place data in the database
should provide a more responsive system. In order to take
advantage of data placement DBA must know what relations
exist in the system and which ones are joined on a recurring
basis.

The security aspects of a relational database system
should be a critical issue for DBA. Since a single database
will be used by various users in the organization, there
will be data which certain groups of users do not require to
perform their functions and, more importantly, should not be
allowed to access. Although there is more to security than
access control, this appears to be the only mechanism
available to DBA to implement a security system in the
database. Consequently, access control should be employed to
restrict the data available to the users and simultaneously
provide a relational database perspective to each user.

In RDM 1100 there is a trade-off between security,
performance, and relational perspective. There were three
methods discussed to provide a single external view of the
database to a user or group of users. Each of these methods
required the sacrifice of one of the trade-off features, and
in order to resolve this problem, a change in the
implementation of ALL is necessary.

The features and issues discussed in this thesis should
provide DBA with some guidelines and topics to investigate
which will make his database system acceptable and
responsive to the users. Although the success or failure of
any system cannot be realistically placed on a single
individual, it appears DBA will be more responsible than any
other person connected with the system if it does not meet
the users' perceived needs.


APPENDIX A


EXAMPLES OF STORED COMMANDS


ACCESS_LIST


DESTROY ACCESS_LIST GO
DEFINE ACCESS_LIST
RETRIEVE (RELATION.NAME, RELATION.TYPE,
    FIELDS = RELATION.ATTCNT, RECORDS = RELATION.TUPS,
    USER = USERS.NAME)
WHERE RELATION.NAME = $0
AND PROTECT.RELID = RELATION.RELID
AND PROTECT.USER = USERS.ID
AND MOD (INT1 (SUBSTRING (1, 1, PROTECT.ATTMAP)), 4) = 1
END DEFINE GO
ASSOCIATE ACCESS_LIST WITH "RETURNS ACCESS LIST FOR AN
OBJECT" GO


ALL_INDICES


DESTROY ISTATUS GO
CREATE ISTATUS (STATUS = I1, DESC = C30) GO
APPEND TO ISTATUS (STATUS = 0,
    DESC = "NONUNIQ*NONCLUS*NO DEL SILENT")
APPEND TO ISTATUS (STATUS = 1,
    DESC = "UNIQ*NONCLUS*NO DEL SILENT")
APPEND TO ISTATUS (STATUS = 2,
    DESC = "NONUNIQ*CLUS*NO DEL SILENT")
APPEND TO ISTATUS (STATUS = 3,
    DESC = "UNIQ*CLUS*NO DEL SILENT")
APPEND TO ISTATUS (STATUS = 4,
    DESC = "NONUNIQ*NONCLUS*DEL SILENT")
APPEND TO ISTATUS (STATUS = 5,
    DESC = "UNIQ*NONCLUS*DEL SILENT")
APPEND TO ISTATUS (STATUS = 6,
    DESC = "NONUNIQ*CLUS*DEL SILENT")
APPEND TO ISTATUS (STATUS = 7,
    DESC = "UNIQ*CLUS*DEL SILENT")
PERMIT READ OF ISTATUS TO ALL
DENY WRITE OF ISTATUS TO ALL GO
DESTROY ALL_INDICES GO
DEFINE ALL_INDICES
RETRIEVE (REL = REL_NAME (INDICES.RELID), INDICES.INDID,
    INDICES.ATTCNT, ISTATUS.DESC)
ORDER BY REL_NAME (INDICES.RELID)
WHERE ISTATUS.STATUS = MOD (MOD (INDICES.STAT, 8) + 8, 8)
AND RELATION.RELID = INDICES.RELID
AND RELATION.TYPE = "U"
END DEFINE GO
ASSOCIATE ALL_INDICES WITH "LIST ALL INDICES ON USER
RELATIONS" GO


ATT_IN_INDX1


DESTROY ATT_IN_INDX1 GO
DEFINE ATT_IN_INDX1
RETRIEVE (INDICES.INDID,
ATT1 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (4, 1, INDICES.KEYS))),
ATT2 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (14, 1, INDICES.KEYS))),
ATT3 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (24, 1, INDICES.KEYS))),
ATT4 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (34, 1, INDICES.KEYS))),
ATT5 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (44, 1, INDICES.KEYS))),
ATT6 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (54, 1, INDICES.KEYS))),
ATT7 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (64, 1, INDICES.KEYS))),
ATT8 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (74, 1, INDICES.KEYS))))
WHERE INDICES.INDID = $0
AND INDICES.RELID = REL_ID ($1)
END DEFINE GO
ASSOCIATE ATT_IN_INDX1 WITH "LISTS NAMES OF ATTRIBUTES IN
INDEX" GO





ATT_IN_INDX2


DESTROY ATT_IN_INDX2 GO
DEFINE ATT_IN_INDX2
RETRIEVE (INDICES.INDID,
ATT9 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (84, 1, INDICES.KEYS))),
ATT10 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (94, 1, INDICES.KEYS))),
ATT11 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (104, 1, INDICES.KEYS))),
ATT12 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (114, 1, INDICES.KEYS))),
ATT13 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (124, 1, INDICES.KEYS))),
ATT14 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (134, 1, INDICES.KEYS))),
ATT15 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (144, 1, INDICES.KEYS))),
ATT16 = FIELD_NAME (INDICES.RELID, INT1 (SUBSTRING
    (154, 1, INDICES.KEYS))))
WHERE INDICES.INDID = $0
AND INDICES.RELID = REL_ID ($1)
END DEFINE GO
ASSOCIATE ATT_IN_INDX2 WITH "LISTS NAMES OF ATTRIBUTES IN
INDEX" GO


DEPENDS


DESTROY DEPENDS GO
DESTROY OTYPE GO
CREATE OTYPE (TYPE = UC1, DESC = UC15) GO
APPEND TO OTYPE (TYPE = "U", DESC = "USER TABLE     ") GO
APPEND TO OTYPE (TYPE = "S", DESC = "SYSTEM TABLE   ") GO
APPEND TO OTYPE (TYPE = "T", DESC = "TRANSACTION LOG") GO
APPEND TO OTYPE (TYPE = "F", DESC = "FILE           ") GO
APPEND TO OTYPE (TYPE = "V", DESC = "USER VIEW      ") GO
APPEND TO OTYPE (TYPE = "C", DESC = "STORED COMMAND ") GO
DENY WRITE OF OTYPE GO
DENY READ OF OTYPE GO
DEFINE DEPENDS
RETRIEVE (OBJECT = RELATION.NAME, WHICH_IS_A =
    STRING (15, OTYPE.DESC), DEPENDS_ON = $1)
WHERE CROSSREF.RELID = REL_ID ($1)
AND CROSSREF.DRELID = RELATION.RELID
AND OTYPE.TYPE = RELATION.TYPE
END DEFINE GO
ASSOCIATE DEPENDS WITH "LISTS DEPENDENCIES OF THE NAMED
OBJECT" GO


FIELDS


DESTROY FIELDS GO
DESTROY FIELD_EQUIV GO
CREATE FIELD_EQUIV (NAME = UC4, NUM = I1) GO
APPEND TO FIELD_EQUIV (NAME Ae Oe re Be
APPEND TO FIELD_EQUIV (NAME = "BIN ", NUM = 45) GO
APPEND TO FIELD_EQUIV (NAME = "CHAR", NUM = 47) GO
APPEND TO FIELD_EQUIV (NAME = "INT ", NUM = 48) GO
APPEND TO FIELD_EQUIV (NAME = "INT ", NUM = 52) GO
APPEND TO FIELD_EQUIV (NAME = "INT ", NUM = 56) GO
DEFINE FIELDS
RETRIEVE (TABLE = RELATION.NAME, FIELD = ATTRIBUTE.NAME,
    TYPE = FIELD_EQUIV.NAME, LEN = ATTRIBUTE.LEN)
WHERE ATTRIBUTE.RELID = RELATION.RELID
AND RELATION.NAME = $TABLE_NAME
AND FIELD_EQUIV.NUM = ATTRIBUTE.TYPE
END DEFINE GO
ASSOCIATE FIELDS WITH "RETURNS ALL FIELDS IN THE NAMED
RELATION" GO





HELP


HELP_REL

OBJECT         LINE_NO   DESCRIPTION
ATT_IN_INDX1    1        THIS IS A STORED COMMAND WHICH HAS
                2        TWO PARAMETERS. THE FIRST PARA-
                3        METER IS THE INDEX ID NO. AND THE
                4        SECOND IS THE RELATION NAME.
                5        THE PARAMETERS MUST BE SEPAR-
                6        ATED BY COMMAS. THIS COMMAND
                7        PROVIDES THE ATTRIBUTE NAMES OF
                8        EACH ATTRIBUTE IN THE INDEX FOR
                9        THE GIVEN RELATION OR VIEW. TO
               10        EXECUTE THIS COMMAND JUST TYPE
               11        ITS NAME FOLLOWED BY THE OBJECT
               12        NAME AND "GO".


DESTROY HELP GO
DEFINE HELP
RETRIEVE (HELP_REL.DESCRIPTION)
ORDER BY HELP_REL.LINE_NO : A
WHERE HELP_REL.OBJECT = $OBJECTNAME
AND PROTECT.RELID = REL_ID (HELP_REL.OBJECT)
AND PROTECT.USER = USERID ()
AND (MOD (INT1 (SUBSTRING (1, 1, PROTECT.ATTMAP)),
    4) = 1)
END DEFINE GO
PERMIT EXECUTE OF HELP TO ALL
ASSOCIATE HELP WITH "PROVIDES INFORMATION ABOUT THE OBJECT
PASSED AS A PARAMETER" GO


INDEX_LIST


DESTROY INDEX_LIST GO
DEFINE INDEX_LIST
RETRIEVE (RELATION.NAME, INDICES.INDID, INDICES.ATTCNT,
    ISTATUS.DESC)
ORDER BY INDICES.INDID
WHERE INDICES.RELID = RELATION.RELID
AND RELATION.NAME = $0
AND ISTATUS.STATUS = MOD (MOD (INDICES.STAT, 8)
    + 8, 8)
END DEFINE GO
ASSOCIATE INDEX_LIST WITH "LIST INDICES ON NAMED
RELATION/VIEW" GO


PROTECTION


DESTROY PTYPE GO
DESTROY ATYPE GO
CREATE PTYPE (ACCESS = I1, DESC = UC15) GO
APPEND TO PTYPE (ACCESS = 1, DESC = "READ           ")
APPEND TO PTYPE (ACCESS = 2, DESC = "WRITE          ")
APPEND TO PTYPE (ACCESS = 3, DESC = "ALL            ")
APPEND TO PTYPE (ACCESS = -32, DESC = "EXECUTE        ")
APPEND TO PTYPE (ACCESS = -53, DESC = "CREATE DATABASE")
APPEND TO PTYPE (ACCESS = -56, DESC = "CREATE         ")
APPEND TO PTYPE (ACCESS = -58, DESC = "CREATE INDEX   ") GO
PERMIT READ OF PTYPE GO
DENY WRITE OF PTYPE GO
CREATE ATYPE (ACCESS = I1, DESC = UC8) GO
APPEND TO ATYPE (ACCESS = 1, DESC = "PERMIT  ")
APPEND TO ATYPE (ACCESS = 2, DESC = "DENY    ")
APPEND TO ATYPE (ACCESS = 3, DESC = "BOTH    ") GO
PERMIT READ OF ATYPE GO
DENY WRITE OF ATYPE GO
DESTROY PROTECTION GO
DEFINE PROTECTION
RETRIEVE (ACCESS = CONCAT (ATYPE.DESC, PTYPE.DESC),
    OBJECT = RELATION.NAME, USER = USERS.NAME)
WHERE ATYPE.ACCESS = MOD (INT1 (SUBSTRING (1, 1,
    PROTECT.ATTMAP)), 4)
AND PTYPE.ACCESS = PROTECT.ACCESS
AND RELATION.RELID = PROTECT.RELID
AND RELATION.NAME = $0
AND PROTECT.USER = USERS.ID
END DEFINE GO


ASSOCIATE PROTECTICN wITH 


DESTROY TABLES GO 


DEFINE TABLES GC 


RETRIEVE (RELATION.NAME, 


RELATION,ATTCNT, 


"CISPLAY PROCTECTICN DATA ABOUT 


THE NAMED RELATICN" GO 


TAE GES 


RELATICN.TYPE, FIELCS = 


RECORDS = RELATION.TUPS) 


ORDER BY RELATION.NAME 3: A 


WHERE RELATICN,TYPE = 


ENC DEFINE GO 


$Q 
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ASSOCIATE TABLES WITH "RETURNS LIST CF RELATIONS, VIEWS OR 


STCRED CCMMANDS" GC 


WHATIS

DESTROY WHATIS GO
DEFINE WHATIS
RETRIEVE (RELATION = REL_NAME (DESCRIPTIONS.RELID),
    EXPLANATION = DESCRIPTIONS.TEXT)
WHERE DESCRIPTIONS.RELID = REL_ID ($0)
END DEFINE GO
ASSOCIATE WHATIS WITH "EXPLAINS WHAT A STORED COMMAND/RELATION DOES/IS" GO


WHOCREATES

DESTROY WHOCREATES GO

DEFINE WHOCREATES
RETRIEVE (USERS.NAME, PTYPE.DESC)
WHERE PROTECT.USER = USERS.ID
AND (PROTECT.ACCESS = -53 OR PROTECT.ACCESS = -56 OR
    PROTECT.ACCESS = -58)
AND PROTECT.ACCESS = PTYPE.ACCESS
AND MOD (INT1 (SUBSTRING (1, 1, PROTECT.ATTMAP)),
    4) = 1
END DEFINE GO
ASSOCIATE WHOCREATES WITH "LIST USERS WHO HAVE CREATE PERMISSION" GO
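
The HELP, PROTECTION and WHOCREATES commands all test the expression MOD (INT1 (SUBSTRING (1, 1, PROTECT.ATTMAP)), 4) against the ATYPE codes (1 = PERMIT, 2 = DENY, 3 = BOTH). The following Python sketch illustrates that decoding outside the database machine. It is not part of the original listing. It assumes that SUBSTRING (1, 1, ...) selects the first byte of ATTMAP and that INT1 yields a non-negative value; the function names access_kind and describe are hypothetical.

```python
# Illustrative sketch only. The mapping mirrors the ATYPE relation in the
# listing; the helper names are hypothetical.
ATYPE = {1: "PERMIT", 2: "DENY", 3: "BOTH"}


def access_kind(attmap: bytes) -> int:
    """Return the first byte of attmap modulo 4.

    Assumption: this corresponds to
    MOD (INT1 (SUBSTRING (1, 1, ATTMAP)), 4) for a non-negative byte value.
    """
    if not attmap:
        raise ValueError("attmap must contain at least one byte")
    return attmap[0] % 4


def describe(attmap: bytes) -> str:
    """Translate the code into the ATYPE description, or UNKNOWN."""
    return ATYPE.get(access_kind(attmap), "UNKNOWN")


if __name__ == "__main__":
    for sample in (b"\x05", b"\x06", b"\x07", b"\x04"):
        print(sample.hex(), describe(sample))
```

Under these assumptions, a PROTECT tuple qualifies for WHOCREATES or HELP only when access_kind returns 1, that is, when the entry is a PERMIT.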